
Rachael F. answered 06/02/21
Data Science Python Expert, 6+ Years of Professional Experience
Hey Andrew!
A group-by request like this is a good fit for a pandas DataFrame, which can both represent and manipulate the data.
What I imagine the data to look like:
| country | invoice_amount |
| Peru | 1200.09 |
| Germany | 500.0 |
| Germany | 6712.45 |
If data like this is read into a pandas DataFrame with "country" and "invoice_amount" as the columns (call it invoice_df), we can use the groupby method to aggregate the invoice amounts by country.
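Here's a rough sketch of what that could look like. It assumes the three sample rows from the table above and uses pandas named aggregation; the output column names are the ones described in this answer, and the way the data gets loaded is just a placeholder.

```python
import pandas as pd

# Sample rows copied from the table above; real data would come from a file or database.
invoice_df = pd.DataFrame(
    {
        "country": ["Peru", "Germany", "Germany"],
        "invoice_amount": [1200.09, 500.0, 6712.45],
    }
)

# Group by country, then compute count, mean, and sum of the invoice amounts.
invoiced_by_country = (
    invoice_df.groupby("country")["invoice_amount"]
    .agg(Num_Invoices="count", Avg_Invoice_Total="mean", Total_Amount="sum")
    .round(2)        # round the float columns to at most two decimal places
    .reset_index()   # turn "country" back into a regular column
)

print(invoiced_by_country)
```

Named aggregation keeps the column renaming and the aggregation in one step, so there is no separate rename call to keep in sync.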
The result, invoiced_by_country, should look just like the required table. For every country, we get the count of invoices ('Num_Invoices'), the average value of an invoice ('Avg_Invoice_Total'), and the sum of all the invoice values ('Total_Amount').
One note: the code rounds the data to two decimal places, but I'm confused by the wording of the question. It asks to round to "at most 2" decimal places and also to "no more and no less" than 2 decimal places, and these are different requirements. As written, the code does not show extra zeros at the end of a number, so it follows the "at most 2" rule. For example, 2.1 is represented as 2.1 instead of 2.10.
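If the "no more and no less" reading is the one that's wanted, one option is to format the values as strings at display time. This is only a sketch with made-up values (the `amounts` Series is hypothetical), and note that the formatted result holds text, not numbers:

```python
import pandas as pd

# Hypothetical amounts, used only to show the difference between the two rules.
amounts = pd.Series([2.1, 500.0, 1200.09])

# "At most 2" decimals: values stay numeric, trailing zeros are not shown.
at_most_two = amounts.round(2)

# "Exactly 2" decimals: values become strings such as "2.10".
exactly_two = amounts.map("{:.2f}".format)

print(at_most_two.tolist())
print(exactly_two.tolist())  # ['2.10', '500.00', '1200.09']
```

Since the formatted column is text, it's usually best to do this as the very last step, after all the arithmetic is finished.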