17

I would like to calculate the sum of values in one column, based on values in another column. To illustrate my question I uploaded the picture below. The second column consists of zipcodes, the third total sales. I would like to know what the total sales per zipcode are? So in this case:

What is the sum of column three for column two's value 1023? Output then should be 4 (0+1+1+2+0). The sum for 1024 then should be 11,5 following the same logic.

As this is quite a large dataset I would like to have the QGIS field calculator (or any other tool) calculate a new field which states the total sales in the zipcode. So far I can't seem to get it working and was hoping someone here could help me.

As you can see I have some missing values, is this a problem in calculation? Perhaps good to mention is that it is a table.

Example

PolyGeo
  • 65,136
  • 29
  • 109
  • 338
Ammar
  • 171
  • 1
  • 1
  • 4

3 Answers3

25

Update: QGIS 2.18 (07/07/2017):

In the Field Calculator, you can use the Aggregate function sum which allows you to sum the values of a column based on an expression. You can also, optionally, group and filter features for summation. So in your field calculator, you could simply use:

sum( "Sum", "Sales")

Which does exactly the same as the method using the GroupStats plugin.



Original post: 02/03/2016

Use the GroupStats plugin which you may need to download from Plugins > Manage and Install Plugins.

Here's an example layer using some of the attributes you have shown:

Layer

So let's:

  1. Run the GroupStats plugin, select the relevant fields for summing (Sales for rows; Sum for values) and then click Calculate. You should see the results showing the sum of each sale:

    GroupStats plugin

  2. Save the results as a .csv file from the GroupStats toolbar:

    Export to csv

  3. Import the .csv file back into QGIS either by dragging it to the interface or from the toolbar:

    Layer > Add Layer > Add Delimited Text Layer (note that the values are separated by semi-colons)

  4. Do a Joins with the shapefile and the .csv file, matching the Sales fields:

    Joins

  5. The shapefile should now contain the sum values from the .csv file:

    Results

You could, if you wish, proceed to use the Save As... option on the shapefile and save it as a new one to keep the joined fields.

Joseph
  • 75,746
  • 7
  • 171
  • 282
  • It would be helpful if the field calculator component of this answer included the SQL lines which group and/or filter. @Joseph – eyerah Jan 31 '18 at 17:58
  • @eyerah - The expression sum( "Sum", "Sales") uses "Sales" as a group filter, if that is what you mean? – Joseph Feb 01 '18 at 10:09
  • 1
    Oh. I see that now. That is what I meant. Thanks. – eyerah Feb 03 '18 at 22:59
12

If you update to the very latest QGIS version 2.14, you can use virtual layers making a sql statement like:

SELECT "MYTABLE"."MYID", Sum("MYTABLE"."SALES")  AS zip_sale /*:int*/
FROM "MYTABLE"
GROUP BY "MYTABLE"."ZIPCODE"

The UI looks like:

enter image description here

You can also use the DB manager, where you have a query builder behind the sql button:

enter image description here

The MYID column can be used to join the summed sales back to the main table. The join tab is under tabel properties.

Take a look at the QGIS visual changelog under Feature: Virtual layers

Jakob
  • 7,471
  • 1
  • 23
  • 41
6

A way of doing this is the following:

  • In the attribute table, choose select by feature. Type "column-name" = 1023. All fields with the value 1023 are selected.
  • Go to vector tools --> Analysing tools --> basic statistics. Choose your layer and the column with the sales. Make sure to check the "selected values only" check-box. The sum will be displayed in the result window. You have to write it down ;)
  • repeat the steps for every zip-code
PolyGeo
  • 65,136
  • 29
  • 109
  • 338
Miron
  • 1,231
  • 7
  • 20
  • 1
    P.S. Missing values are no problem. – Miron Mar 02 '16 at 09:26
  • This is a simple and well-working approach. However it might be very time consuming when processing a large data set as Ammar mentioned... – dru87 Mar 02 '16 at 09:27