Module 1: What is Puppet and why use it?
1. The problem: the era of “artisan server crafting”
In the traditional model of systems administration, every server, database, and network configuration was created and managed by hand. This manual approach, often called “artisan server crafting,” is complicated, tedious, and time-consuming. As an infrastructure grows, several critical issues emerge:
- Configuration drift: Servers that start out identical inevitably diverge over time as manual tweaks are applied. This leads to “snowflake servers”: special systems that cannot be easily replicated or repaired because their unique configuration isn’t documented.
- Human error: Humans are not naturally good at accurately repeating complex tasks across hundreds of machines. Typos and missed steps are common, leading to outages and security vulnerabilities.
- Scalability limitations: While one sysadmin can manage a handful of servers by hand, the artisan approach quickly becomes unmanageable as the fleet grows. Even at around 10 servers, keeping every machine consistent by hand is difficult.
- Documentation gaps: Documentation is rarely updated in real-time. The only truly accurate documentation is the server itself, but if that server fails, the knowledge of its setup may be lost.
2. Introduction to configuration management
Configuration Management (CM) is the solution to these challenges. It replaces endless manual commands with simple lines of code. Instead of giving instructions on how to do something (procedural), you define the desired state of the system (declarative).
The goal of CM is to automate the software delivery process, ensuring that systems are consistently configured and up to date. This allows IT staff to spend less time on routine drudgery and more time on high-value improvements.
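To make the declarative idea concrete, here is a minimal sketch of a single Puppet resource. It assumes a Linux host where the `root` user and group exist; the file content is purely illustrative.

```puppet
# Declarative: describe the end state, not the commands that reach it.
# A procedural script would have to check whether the file exists,
# compare its content, then run the right chown/chmod commands.
file { '/etc/motd':
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => "This host is managed by Puppet.\n",
}
```

Applying this repeatedly should change nothing once the file already matches the description, which is what makes the declarative style safe to re-run.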
3. Puppet in the IaC landscape
Puppet fits into the broader Infrastructure as Code (IaC) landscape as a tool for managing the lifecycle of a server. IaC treats infrastructure with the same rigor as software development, utilizing version control, automated testing, and continuous integration.
Puppet models the system as a collection of Resources. A resource is an atomic unit of configuration, such as a file, a user account, or a software package. For a resource to be effectively managed by Puppet, it must be:
- Unique: Distinguished from all other resources (e.g., a specific file path).
- Searchable: Puppet must be able to determine its current state on the server.
- Atomic: It cannot be broken down into smaller, managed components.
- Creatable and destroyable: Puppet must have the logic to both bring the resource into existence and remove it.
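The properties above map onto how resources are written. The sketch below uses hypothetical names (`deploy`, `/etc/app.conf`) to show a type, a unique title, and the `ensure` attribute that tells Puppet whether to create or remove the resource.

```puppet
# General shape: type { 'title': attribute => value, ... }
# The combination of type and title must be unique within a catalog.

# A user account that should exist ('deploy' is a hypothetical name).
user { 'deploy':
  ensure     => present,
  shell      => '/bin/bash',
  managehome => true,
}

# A package that should be removed if it is found on the node.
package { 'telnet':
  ensure => absent,
}

# A file identified by its path (hypothetical path).
file { '/etc/app.conf':
  ensure => file,
  owner  => 'deploy',
  mode   => '0640',
}
```

Declaring a second `file { '/etc/app.conf': }` elsewhere in the same catalog would be rejected as a duplicate declaration, which is the uniqueness rule in practice.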
4. Puppet architecture: agent/server vs. masterless
Puppet primarily operates using an Agent/Server (Master) architecture, though it supports alternative models.
- The pull model (Standard): A central Puppet Master stores the authoritative configurations (manifests). The Puppet Agent software is installed on every managed node. Periodically (every 30 minutes by default), the agent checks in with the master and sends facts about the node. The master compiles a node-specific configuration “catalog,” which the agent pulls down and applies locally.
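- Scaling the master: For large environments (thousands of nodes), the workload of compiling catalogs can bog down the master. In older Puppet releases, masters were often run under Passenger (via Apache or Nginx) to handle the increased volume of HTTPS transactions. Current releases use the JVM-based Puppet Server instead and typically scale by adding compile servers behind a load balancer.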
- Masterless configuration: In this model, the Puppet code is pushed directly to the nodes (often via Git), and the `puppet apply` command is run locally. This eliminates the single point of failure of a central master and is suitable for certain distributed environments.
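As a rough illustration of how one set of manifests yields a different catalog per node, here is a sketch of a main manifest with node definitions. The hostname and the choice of package are hypothetical.

```puppet
# Main manifest (commonly named site.pp).
# A node block is matched against the name the node identifies itself with.
node 'web01.example.com' {
  package { 'nginx':
    ensure => installed,
  }
}

# Fallback for any node without a more specific definition.
node default {
  notify { 'No node-specific configuration is defined for this host': }
}
```

In the agent/server model, the server evaluates this manifest when a node checks in. In a masterless setup, the same file could be checked out on the node and passed to `puppet apply`.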
5. Comparing the big three: Puppet, Ansible, and Chef
While all three are titans of the configuration management world, they differ in philosophy and execution:
| Feature | Puppet | Ansible | Chef |
|---|---|---|---|
| Language Style | Declarative: You define the end state. | Mostly procedural: Tasks run in the order written, though individual modules are declarative. | Procedural ordering: Recipes are Ruby code run top to bottom, built from declarative resources. |
| Architecture | Agent-based: Requires agent software on nodes. | Agentless: Uses SSH to run commands. | Agent-based: Requires Chef client on nodes. |
| Centralization | Typically requires a Master server. | Masterless: Runs from a laptop or CI server. | Typically requires a Chef Server. |
| Philosophy | Eventual Consistency: Re-runs ensure stability. | Direct Orchestration: Great for multi-tier tasks. | Infrastructure as Code: High flexibility via Ruby. |
Ansible is often preferred for rapid orchestration because it does not require bootstrapping an agent on every server. However, Puppet and Chef are often considered more robust for Continuous Configuration Synchronization, where the tool runs unattended to automatically revert manual changes and prevent drift.
6. When is Puppet the right tool for the job?
Puppet is the ideal choice when your goal is to standardize a massive environment and ensure it stays in a “known good” state.
- Drift prevention: Because the Puppet agent runs on a schedule, it acts as an automated “policeman” that identifies and corrects unauthorized changes to system files or services.
- Standardized environments: Puppet allows you to define Roles and Profiles, ensuring that all web servers or database servers in your fleet are built from the same definition and configured identically.
- Cross-platform consistency: Puppet abstracts the differences between operating systems. You can write one manifest to “ensure Apache is installed,” and Puppet will automatically use `apt` on Ubuntu and `yum` on Red Hat to make it happen.
- Self-updating documentation: In a Puppet-managed environment, the code is the documentation. It is guaranteed to be up to date because it is what builds the live infrastructure.
- Scaling large teams: Through tools like Hiera, Puppet allows teams to separate code from data. This means senior engineers can write reusable modules while others simply update YAML files with specific node data.
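Several of these points can be sketched in one small example. The class names `profile::apache` and `role::webserver` and the parameter `package_ensure` are hypothetical. In a real codebase each class would normally live in its own module file rather than in one manifest. Note that while Puppet selects the package provider for you, the package name itself usually still differs between OS families, so the sketch handles that in one place.

```puppet
# Profile: wraps one technology and hides the OS differences.
class profile::apache (
  # With Hiera, this can be overridden by setting the key
  # 'profile::apache::package_ensure' in a YAML data file.
  String $package_ensure = 'installed',
) {
  # The provider (apt, yum, ...) is chosen by Puppet; only the name varies.
  case $facts['os']['family'] {
    'Debian': { $apache_name = 'apache2' }
    'RedHat': { $apache_name = 'httpd' }
    default:  { fail("Unsupported OS family: ${facts['os']['family']}") }
  }

  package { $apache_name:
    ensure => $package_ensure,
  }

  # Checked on every agent run: if someone stops the service by hand,
  # the next run starts it again.
  service { $apache_name:
    ensure  => running,
    enable  => true,
    require => Package[$apache_name],
  }
}

# Role: describes what a node is, composed only of profiles.
class role::webserver {
  include profile::apache
}
```

Keeping the OS-specific logic inside the profile means the role stays a one-line description of the server's purpose, and node-specific values stay in data rather than in code.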
Using Puppet is like moving from being a bricklayer (manually placing every component) to being an architect who uses prefabricated wall panels. You define the blueprint once, and the panels (Puppet code) rise quickly and flawlessly every time, regardless of which building site (server environment) you are on.
