What is Data Analysis and Visualization?
Data analysis involves collecting, organizing, and analyzing raw data to extract relevant information. This process is crucial for making data-driven decisions. By analyzing data, we gain valuable insights and ensure accuracy in decision-making.
On the other hand, visualization refers to representing data graphically. Various types of charts help us understand data patterns, trends, and outliers. Visualization makes raw data more accessible and comprehensible.
Excel offers powerful features and functions for data analysis and visualization. Let’s explore the steps involved:
1. Data Processing
Data processing is the initial step in data analysis and visualization. It includes the following segments:
1.1. Data Collection
First, you have to collect information. The information should be collected in a systematic way from a trustworthy source. The source can be any database, online platform, or any web page. Then this data needs to be further processed.
1.2. Data Organization
After collecting the data, the next step is to organize the data in such a way that it is easily understandable to people. The data can be arranged in a tabular form with a clear title and headings.
1.3. Data Cleaning
Clean the organized data to remove blank spaces, unwanted empty cells, and duplicate values. Data cleaning ensures an accurate and error-free dataset.
For our example, we’ve processed a dataset containing sales quantities of products (such as mobiles, laptops, and televisions) for a company. The sales data is provided on a monthly basis, and we’ll use this dataset for data analysis and visualization in Excel.
2. Data Analysis
Excel offers various techniques for data analysis. Let’s explore some useful functions and their applications:
2.1. Using Excel Functions
Excel provides numerous functions for data analysis. Here are a couple of examples:
2.1.1. Using the SUM Function
To find the total number of mobiles, laptops, and televisions sold in a year:
- Go to cell C17 and enter the following formula:
=SUM(C5:C16)
- Select cell C17 and use Fill Handle to AutoFill data in range D17:E17.
2.1.2. Using INDEX & MATCH Functions
To find the sales quantity of a particular product for any month:
- Enter the name of any month in cell G5.
- In cell H5, enter the following formula:
=INDEX(B5:E16,MATCH(G5,B5:B16,0),2)
- This will display the sales quantity of mobiles for that specific month.
- You can change the month name in cell G5 and the sales quantity of mobile will be shown in cell H5.
Formula Breakdown
MATCH(G5,B5:B16,0): The MATCH function searches for the value in cell G5 within the specified range B5:B16. The 0 indicates an exact match.
Result: 7
INDEX(B5:E16,MATCH(G5,B5:B16,0),2): The INDEX function uses the row number returned by the MATCH function. It retrieves the value from the corresponding row number and the second column (2) of range B5:E16.
Result: 57
2.1.3. Using VLOOKUP & MATCH Functions
You can combine the VLOOKUP and MATCH functions to retrieve the sales quantity of any product for a specific month:
- Go to cell H4 and enter the name of any month.
- In cell H5, put the product name.
- In cell H6, enter the following formula:
=VLOOKUP(H4,B5:E16,MATCH(H5,B4:E4,0))
This will display the sales quantity of the specified product for the given month.
Formula Breakdown
MATCH(H5,B4:E4,0): The MATCH function searches for the value in cell H5 within the specified range B4:E4. The 0 indicates an exact match.
Result: 3
VLOOKUP(H4,B5:E16,MATCH(H5,B4:E4,0)): The VLOOKUP function searches for the value in cell H4 in the leftmost column of the range B5:E16. Once it finds a match, it returns the value from the same row and the column specified by the MATCH function output.
Result: 272
2.1.4. Combining SUM & SUMIFS Functions
To determine the total number of products sold for any month:
- Enter the month name in cell G5.
- In cell H5, enter the following formula:
=SUM(SUMIFS(C5:C16,B5:B16,G5), SUMIFS(D5:D16,B5:B16,G5), SUMIFS(E5:E16,B5:B16,G5))
- This will show the overall product count for that month.
Formula Breakdown
SUMIFS(C5:C16,B5:B16,G5): Sums the values in range C5:C16 where the corresponding cells in range B5:B16 match the value in cell G5.
Result: 125
SUMIFS(D5:D16,B5:B16,G5): Sums the values in range D5:D16 using the same criteria.
Result: 272
SUMIFS(E5:E16,B5:B16,G5): Sums the values in range E5:E16 using the same criteria.
Result: 147
SUM(SUMIFS(C5:C16,B5:B16,G5),SUMIFS(D5:D16,B5:B16,G5),SUMIFS(E5:E16,B5:B16,G5)): Adds the results of the three SUMIFS functions, considering only matching values in range B5:B16.
Result: 544
2.1.5. Apply MAX, INDEX & MATCH Functions
To find the maximum quantity sold in a month for a specific product:
- Go to cell G6.
- Enter the following formula to find the maximum number of laptops sold:
=MAX(D5:D16)
- Then, in cell H6, find the name of that month using:
=INDEX(B5:E16,MATCH(G6,D5:D16,0),1)
2.2. Apply Conditional Formatting
Apply conditional formatting to understand the relative quantity of individual products sold over 12 months:
- Select range C5:C16.
- Go to the Home tab, choose Conditional Formatting, and select Data Bars.
- Pick a fill color to highlight the cells.
- Repeat the same steps for ranges D5:D16 and E5:E16.
2.3. Using Pivot Table
- Go to the Insert tab, select Pivot Table, and choose From Table/Range.
- Set the dialog box as shown below:
- Table/Range: ‘Pivot Table’!$B$4:$E$16
- Location: Cell G4 of the Existing Worksheet
- Click OK.
- Arrange the PivotTable Fields as needed to get the grand total of the products.
- You will have the pivot table with the grand total of the products.
2.4. Sorting a Product
Sort any product in ascending or descending order. For example, to sort the dataset in descending order based on mobile sales quantity:
- Select range C5:C16.
- Go to the Home tab, click on Sort & Filter and select Sort Largest to Smallest.
- Select Expand the selection and press Sort.
- The dataset will now be sorted by mobile phone sales quantity in descending order.
2.5. Filtering Data
- Select range B4:E16.
- Go to the Home tab, click Sort & Filter, and choose Filter.
- Filter icons will appear in all the columns.
- Click the filter icon in the Month column.
- Choose the desired month and click on OK.
- You will see the filtered data of that specific month.
2.6. Using the What-If Analysis Feature
Excel’s What-If Analysis combines techniques and tools to predict the impact of formulas and models. It includes the following features:
- Goal Seek: Predicts a parameter value to achieve a specific output. Specify the target value, and Excel determines the input value accordingly.
- Scenario Manager: Define different scenarios for various inputs and observe how the output varies.
- Data Table: Create tables to evaluate multiple results based on input values. Use one or two variables to analyze their impact on the output.
Let’s demonstrate the Goal Seek feature. Given that the total units of televisions sold is 1564, we want to generate a revenue of $1,564,000. We’ll find the unit price of televisions required to reach this revenue.
- In cell H7, enter the formula:
=H5*H6
- Go to the Data tab, select What-If Analysis, and choose Goal Seek.
- Set the Goal Seek dialog box as shown below.
- You’ll get the unit price in cell H6 required to achieve the desired revenue of $1,564,000.
2.7. Applying Data Validation
To find the sales quantity of laptops for any month:
- Select cell G5.
- Go to the Data tab, choose Data Validation.
- In the Data Validation dialog box:
- Select List as the validation criteria.
- Set the source location to:
=$B$5:$B$16
- Click OK.
- Cell G5 now has a dropdown list of months.
- In cell H5, enter the formula:
=INDEX(B5:E16,MATCH(G5,B5:B16,0),3)
- This will display the sales quantity of laptops for the selected month.
2.8. Using an Excel Table
- Select the range B4:C16.
- Go to the Insert tab and choose Table.
- Check the My table has headers option.
- Press OK.
- The table will be created.
- Select the Total Row from the Table Design tab.
- Cell C17 will show the total number of televisions.
- You can also use the dropdown in cell C17 to choose other parameters.
2.9. Using the Analyze Data Feature
- Go to the Home tab and select Analyze Data.
- Explore different options, such as pivot tables and charts, to analyze your dataset.
If you don’t see the Analyze Data option in your Home tab, customize it:
- Right-click on the Home tab and choose Customize Ribbon.
- Add a New Group, set its position, select All Commands, find Analyze Data in the list, and add it to the newly created group.
2.10. Using the Analysis ToolPak Add-in
Activate the Analysis ToolPak add-in for a wide range of analysis:
- Go to the File tab.
- Select Options.
- In the Excel Options dialog box, choose Add-ins.
- Under Manage, select Excel Add-ins and click Go.
- Check the option for Analysis ToolPak and click OK.
2.10.1. Analyzing Descriptive Statistics
Let’s analyze the sales quantity of laptops:
- Go to the Data tab and select Data Analysis.
- Choose Descriptive Statistics and press OK.
- Descriptive Statistics dialog box will appear.
- Set the input range, output range, and other properties as needed.
- You’ll see the descriptive statistics for laptops in your worksheet.
2.10.2. Analyzing Anove (Analysis of Variance): Single Factor
ANOVA (Analysis of Variance) is a statistical method used to compare the means of two or more groups and determine if there are significant differences between them. Here’s how you can perform ANOVA in Excel:
- Go to the Data tab.
- Click on Data Analysis.
- In the Data Analysis dialog box, select ANOVA: Single Factor and press OK.
- Set the input range, output range, and other properties as needed.
- The ANOVA analysis results will appear in your worksheet.
3. Data Visualization
After data analysis, visualizing the dataset helps us understand patterns and identify outliers. Let’s create some charts:
3.1. Excel Column Chart
- Select the range B4:E16.
- Go to the Insert tab and choose a suitable column chart.
- Customize the chart title as needed.
3.2. Excel Bar Chart
- Select the same range (B4:E16).
- Insert a suitable bar chart (got to the Insert tab and click on the columns icon drop-down).
- Add an appropriate title.
3.3. Excel Line Chart
- Select the range B4:E16.
- Click on the line chart icon drop-down.
- Choose Line with Markers under the 2-D Line category.
- Provide a relevant title.
3.4. Excel Pie Chart
- Select the range B4:C16.
- Insert a 2-D or 3-D pie chart.
- The pie chart will display the sales quantity of mobiles.
3.5. Excel Doughnut Chart
- Select range B4:B16 and press Ctrl.
- Also select range D4:D16.
- Insert a doughnut chart from the pie chart icon drop-down.
- The doughnut chart will show the sales quantity of laptops.
Advanced Data Visualization
1. Excel Waterfall Chart
A waterfall chart illustrates how positive and negative values contribute to a total. Although our example has only positive values, waterfall charts are useful for analyzing financial data over time.
- Select range B4:B16 (Ctrl + E4:E16).
- Insert a waterfall chart from the drop-down menu.
- Customize the chart title.
You will see only an increasing trend in this example because there are no negative values. This chart is great for analyzing financial data over a period of time.
2. Animated Column Chart
An animated column chart dynamically displays data with animation effects. Unlike static charts, the elements in an animated chart change or update over time, making it easier to understand trends and data patterns.
To create an animated column chart:
- Open the Animated Chart worksheet.
- Go to the Developer tab and select Visual Basic.
- Insert a new Module.
- Paste the following VBA code into your Macro Editor:
Sub Animated_Column_Chart()
'variable declaration
Dim WS As Worksheet
Dim delay_time As String
Dim nRow As Long, nCol As Long
Dim myRng As Range
Dim myArr() As Variant
'screen update ON
Application.ScreenUpdating = True
'set values of variables
Set WS = Sheets("Animated Chart")
Set myRng = WS.Range("C5:E16")
nRow = myRng.Rows.Count
nCol = myRng.Columns.Count
delay_time = "00:00:01"
'array resize
ReDim myArr(1 To nRow, 1 To nCol)
'take values from range of cells to an array
For i = 1 To nRow
For j = 1 To nCol
myArr(i, j) = myRng.Cells(i, j)
Next j
Next i
'clear data range
myRng.Value = nbNullString
'define range of selection
WS.Range("G4").Select
ActiveSheet.Shapes.AddChart.Select
'set the data source
ActiveChart.SetSourceData Source:=WS.Range("$B$4:$E$16")
'declare the chart type
ActiveChart.ChartType = xlColumnClustered
'declare activation command
ActiveSheet.ChartObjects(1).Activate
ActiveSheet.ChartObjects(1).Cut
'select the chart destination
WS.Select
ActiveSheet.Paste
'return to the worksheet
WS.Select
Range("B5").Activate
'show values in range and create animated column chart race
For i = 1 To nRow
For j = 1 To nCol
myRng.Cells(i, j) = myArr(i, j)
DoEvents
Next j
DoEvents
Application.Wait (Now + TimeValue(delay_time))
Next i
End Sub
- Press the Run button or use the F5 key to execute the code.
- You will see the animated column chart in the Animated Chart worksheet.
VBA Breakdown
For i = 1 To nRow
For j = 1 To nCol
myArr(i, j) = myRng.Cells(i, j)
Next j
Next i
- These nested loops go through and copy values from myRng range to myArr array, one by one.
WS.Range("G4").Select
ActiveSheet.Shapes.AddChart.Select
- These lines select cell G4 in the Animated Chart worksheet and add a new chart to the sheet.
ActiveChart.ChartType = xlColumnClustered
- This line sets the chart type to a clustered column chart.
ActiveSheet.ChartObjects(1).Activate
ActiveSheet.ChartObjects(1).Cut
WS.Select
ActiveSheet.Paste
- These lines activate the chart object, cut, and paste it into the Animated Chart worksheet.
For i = 1 To nRow
For j = 1 To nCol
myRng.Cells(i, j) = myArr(i, j)
DoEvents
Next j
DoEvents
Application.Wait (Now + TimeValue(delay_time))
Next i
- These nested loops iterate through each cell in the myRng range and set its value from the myArr array. After setting each value, the DoEvents function allows the system to process events and update the screen to show the changes. Then, there is a time delay specified by the delay_time variable before moving to the next row. This is how animated chart effect is created.
Download Practice Workbook
You can download the practice workbook from here:
Things to Remember
- Remember these tips when working with data analysis and visualization in Excel:
- Use Data Validation instead of manual typing to ensure accurate data entry.
- Refresh your pivot table whenever you update your dataset.
- Be cautious when selecting input and output ranges while using the Data Analysis add-in.
Frequently Asked Questions
- Why use ANOVA?
- ANOVA (Analysis of Variance) is valuable for comparing trend differences between multiple datasets. It reveals variations in averages among different groups, assesses the impact of categorical variables, and validates dataset reliability.
- Difference between inferential and descriptive statistics:
- Inferential statistics predict outcomes for a larger group based on a smaller dataset. Descriptive statistics summarize various parameters of a dataset.
- Advantages of the Analyze Data feature:
- The Analyze Data feature offers useful functions, customizable capabilities, a user-friendly interface, and automated calculations. It surpasses manual data analysis techniques in flexibility and efficiency.
Get FREE Advanced Excel Exercises with Solutions!