Where's My Data? Evaluating Visualizations with Missing Data
Hayeong Song & Danielle Albers Szafir. Where's My Data? Evaluating Visualizations with Missing Data. IEEE Transactions on Visualization and Computer Graphics, 25(1), 2019 (to appear). [Published as part of the Proceedings of IEEE VIS 2018]
Abstract: Many real-world datasets are incomplete due to factors such as data collection failures or misalignments between fused datasets. Visualizations of incomplete datasets should allow analysts to draw conclusions from their data while effectively reasoning about the quality of the data and resulting conclusions. We conducted a pair of crowdsourced studies to measure how the methods used to impute and visualize missing data may influence analysts' perceptions of data quality and their confidence in their conclusions. Our experiments used different design choices for line graphs and bar charts to estimate averages and trends in incomplete time series datasets. Our results provide preliminary guidance for visualization designers to consider when working with incomplete data in different domains and scenarios.
Below are the supplemental materials for "Where's My Data? Evaluating Visualizations with Missing Data." The data capture subjective and objective participant responses to trend and average estimation tasks using two common visualization types: line graphs and bar charts. The data and infrastructure can be used to replicate and extend the models presented in the paper. See the paper for additional details.
Line Graph Data:
- Anonymized Responses for Average Estimation, measuring participant perceptions of confidence, credibility, reliability, and completeness for each combination of visualization method, imputation method, and amount of missing data.
- Anonymized Responses for Trend Estimation, measuring participant perceptions of confidence, credibility, reliability, and completeness for each combination of visualization method, imputation method, and amount of missing data.
- Averaging Infrastructure, the web harness used to collect the averaging data.
- Trend Infrastructure, the web harness used to collect the trend estimation data.
Bar Chart Data:
- Anonymized Responses for Average Estimation, measuring participant perceptions of confidence, credibility, reliability, and completeness for each combination of visualization method, imputation method, and amount of missing data.
- Anonymized Responses for Trend Estimation, measuring participant perceptions of confidence, credibility, reliability, and completeness for each combination of visualization method, imputation method, and amount of missing data.
- Averaging Infrastructure, the web harness used to collect the averaging data.
- Trend Infrastructure, the web harness used to collect the trend estimation data.
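As a starting point for working with the response files listed above, the following is a minimal sketch of averaging one subjective measure for each combination of visualization method, imputation method, and amount of missing data. The column names (`visualization`, `imputation`, `missing_pct`, `confidence`), the condition labels, and the sample rows are all hypothetical placeholders; the actual files may use different headers and layouts, so check them before adapting this.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical column names; the real response files may use different headers.
CONDITION_COLUMNS = ("visualization", "imputation", "missing_pct")


def mean_rating_by_condition(rows, measure):
    """Average one measure (e.g. confidence) within each experimental condition."""
    grouped = defaultdict(list)
    for row in rows:
        # A condition is one combination of the three experimental factors.
        key = tuple(row[col] for col in CONDITION_COLUMNS)
        grouped[key].append(float(row[measure]))
    return {key: mean(values) for key, values in grouped.items()}


# Made-up sample standing in for one anonymized response file.
SAMPLE_CSV = """visualization,imputation,missing_pct,confidence
vis_a,imp_a,10,5
vis_a,imp_a,10,3
vis_b,imp_b,20,6
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = mean_rating_by_condition(sample_rows, "confidence")

for condition, value in sorted(summary.items()):
    print(condition, value)
```

To use a real file, replace the `io.StringIO(SAMPLE_CSV)` source with an opened file handle and adjust `CONDITION_COLUMNS` to match its headers. The same function could be called with any of the other measures (credibility, reliability, completeness) if they are stored as numeric columns.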
Experimental Data: