Pivot Tables
A pivot table answers one question: how does this number break down across these categories? This guide builds the intuition for what pivot tables do, how to read them, and how to construct them from a question.
The one question a pivot table answers
You have a long flat list of rows, like contracts, crimes, patients, transactions. You want to know: how many rows match each category? or what is the total value for each group?
That is all a pivot table does. It collapses many rows into a summary, grouped by one or more categories.
You could do the same thing by hand: sort by ministry, count the rows, write down the number, repeat. A pivot table does it instantly, and lets you change the grouping without starting over.
From flat list to grouped summary
Start with a flat list of procurement contracts. Each row is one contract.
Tidy data: one row per contract
| ministry | year | value (€) |
|---|---|---|
| Health | 2022 | 120,000 |
| Finance | 2022 | 45,000 |
| Health | 2023 | 89,000 |
| Finance | 2023 | 210,000 |
| Health | 2022 | 55,000 |
| Finance | 2023 | 30,000 |
Pivot: count of contracts per ministry
| ministry | count |
|---|---|
| Finance | 3 |
| Health | 3 |
The six rows collapsed into two, one per ministry. The pivot table counted how many times each ministry appeared.
But you can ask a different question of the same data. Let's ask how much instead of how many:
Same tidy data
| ministry | year | value (€) |
|---|---|---|
| Health | 2022 | 120,000 |
| Finance | 2022 | 45,000 |
| Health | 2023 | 89,000 |
| Finance | 2023 | 210,000 |
| Health | 2022 | 55,000 |
| Finance | 2023 | 30,000 |
Pivot: total spending per ministry over the years
| ministry | total (€) |
|---|---|
| Finance | 285,000 |
| Health | 264,000 |
The years dimension disappeared and a we got a sum over the years. Pretty neat!
The four fields
Every pivot table tool (Excel, Google Sheets, Python, R) organises the same four fields. Once you understand what each one does, the tool doesn't matter.
Values: what you want to measure
This is a numeric column. You choose an aggregation: count (how many rows), sum (add them up), average, max, min. Count doesn't need a numeric column, it just counts rows. Sum, average, max, and min do.
Example: Sum of value, or Count of contract_id.
Rows: what you want to group by
A categorical column. The pivot table creates one output row per unique value in this column.
Example: ministry → one row per ministry.
Columns: a second grouping, spread sideways
Optional. A second categorical column whose unique values become column headers. This is what gives the pivot table its name. You are pivoting a category from rows into columns. The result is a cross-tabulation.
Example: year → columns become 2022, 2023, 2024.
Filters: zoom in on a subset
Restrict the data before summarising. The pivot table runs on only the rows that match your filter.
Example: Filter to year = 2023 before counting contracts.
Adding a second dimension with Columns
Here is what happens when you move year from Rows to Columns:
Rows: ministry & year: two grouping columns, stacked
| ministry | year | total (€) |
|---|---|---|
| Finance | 2022 | 45,000 |
| Finance | 2023 | 240,000 |
| Health | 2022 | 175,000 |
| Health | 2023 | 89,000 |
Cross-tabulation. Rows: ministry, Columns: year
| ministry | 2022 | 2023 |
|---|---|---|
| Finance | 45,000 | 240,000 |
| Health | 175,000 | 89,000 |
Both tables contain the same information. The cross-tabulation on the right is easier to scan for comparisons across years. It is also the form journalists most often see in press releases and reports.
Use the stacked form (left) when you have many years or want to sort by total. Use the cross-tab (right) when you want to compare two categories against each other at a glance.
Count vs. sum
These answer different questions. Getting them mixed up produces wrong numbers.
Count: how many contracts did each ministry sign?
| ministry | count of contracts |
|---|---|
| Finance | 3 |
| Health | 3 |
Counts rows. Does not care about the value column.
Sum: how much did each ministry spend?
| ministry | total value (€) |
|---|---|
| Finance | 285,000 |
| Health | 264,000 |
Adds up the value column. A ministry with one huge contract scores higher than one with many small ones.
Both are valid, they just answer different questions. Before you build a pivot table, ask: am I measuring frequency (count) or magnitude (sum)?
From question to pivot table
Work backwards from your question. Each part of the question maps to a field.
| Question | Rows | Columns | Values |
|---|---|---|---|
| How many contracts per ministry? | ministry | count | |
| Total spending per ministry per year? | ministry | year | sum of value |
| Average contract size per region? | region | average of value | |
| Number of crimes per neighbourhood per year, only violent crimes? | neighbourhood | year | count |
| Which supplier received the largest single contract? | supplier | max of value |
If you cannot map your question to these fields, the question may be too complex for a single pivot table, or the data may need to be tidied first.
What pivot tables need to work
Pivot tables only work well on tidy data. If your data has these problems, fix them first:
- Merged cells: a merged region header makes the entire column unusable as a Row or Column field. Unmerge and fill down first.
- Inconsistent values: "Netherlands", "The Netherlands", and "NL" will appear as three separate groups. Standardise categorical columns before pivoting.
- Numbers stored as text: a value column with currency symbols or spaces will return 0 when you try to Sum it. Strip the non-numeric characters first.
- Subtotal rows mixed into data: these will be counted and summed alongside the real data, inflating every result.
See the Tidy Data guide.