[Data Visualization] Why - Task Abstraction


Review: The Big Picture

What data does the user see? (data abstraction)
Why does the user use the system? (task abstraction)
How are the visual encoding and interaction idioms constructed? (idiom abstraction)

Why - Task Abstraction

Why data abstraction?
- There are an infinite number of datasets.
- Thus, it would be inefficient to design a visualization system for every dataset.
Why task abstraction?
- There are an infinite number of tasks that users want to perform on the system.
- Thus, it would be inefficient to design a visualization system for every task.

Consider tasks in abstract form, rather than in a domain-specific way.
- Otherwise, it is hard to make useful comparisons between domain situations.
Actually, below are two instances of “compare values between two

The analysis framework has a small set of carefully chosen words to describe why people are using vis
- Actions: analyze, search, query, $\dots$
- Targets: trends, outliers, distribution, $\dots$
The same vis tool might be usable for many different goals.
- To describe complex activities, you can specify a chained sequence of tasks, where the output of one becomes the input to the next.
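
The idea of a chained sequence can be sketched in a few lines of Python. This is only an illustration: the `Task` class, the `run_chain` helper, and the sample values are hypothetical names and data introduced here, not part of the framework itself. Each task is an (action, target) pair plus a function, and the output of one task is passed as the input to the next.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Task:
    """One task instance: an action (verb), a target (noun), and a step to run."""
    action: str
    target: str
    run: Callable[[Any], Any]


def run_chain(tasks, data):
    """Feed the output of each task into the next one."""
    for task in tasks:
        data = task.run(data)
    return data


# Hypothetical input values, used only to exercise the chain.
values = [3, 1, 4, 1, 5, 9, 2, 6]

chain = [
    # Derive a new attribute (the square of each value).
    Task("derive", "attribute", lambda xs: [x * x for x in xs]),
    # Summarize the derived attribute by its extremes.
    Task("summarize", "extremes", lambda xs: (min(xs), max(xs))),
]

result = run_chain(chain, values)
description = " -> ".join(f"{t.action} {t.target}" for t in chain)
print(description, result)
```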

Who: Designer or User

Although “who” is not part of the what-why-how framework, it is sometimes useful to specify who has a goal or makes a design choice.

Actions

Three levels of actions:
High level choice: Analyze
- Q) How is the vis tool used to analyze data?
- A) Consume existing data or produce additional data.
Mid level choice: Search
- Q) What kind of search is involved?
- A) Lookup, browse, locate, or explore
Low level choice: Query
- Q) Does the user need to identify one target, compare some targets, or summarize all targets?
- A) Identify, compare, or summarize
Choices at the three levels are independent.
- Usually, we describe actions at all three levels.

High level Choice: Analyze

What are the possible goals of users who want to analyze data using a vis tool?
Consume: The most common case for vis is for the user to consume information that has already been generated as data stored in a format amenable to computation.
- This is the most common “why”.
Produce: However, sometimes we use vis to produce new material. We will see examples later.

High level Choice: Analyze - Consume

Three consume goals:
Discover (= explore): to find new knowledge that was not previously known
Present (= explain): to communicate with others about the knowledge that is known
Enjoy: visualization in casual encounters, e.g., infographics

High level Choice: Analyze - Consume - Discover

The discover goal refers to using vis to find new knowledge that was not previously known.
You want to find some insights from an unseen dataset. What will you do?
- Usually, investigation is driven by existing theories, models, hypotheses, or hunches.
The outcome is to generate a new hypothesis, or to verify or disconfirm an existing one.

High level Choice: Analyze - Consume - Discover Example

You have periodic table data.

What can you discover?
You may want to explore the data using a vis tool.

Plot a scatterplot of melting point vs. first ionization energy.

The distribution was bell shaped with an outlier (Carbon).

Plot a scatterplot of melting point vs. boiling point.

My hypothesis was (boiling point) > (melting point), but there were a few outliers (e.g., Californium).
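
Verifying a hypothesis like this one can also be done programmatically, alongside the scatterplot. The sketch below assumes the data is a list of records with `melting_point` and `boiling_point` fields; the field names, the `find_violations` helper, and all values are made up for illustration and are not real element data.

```python
# Hypothetical records: the names and numbers are placeholders, not real elements.
elements = [
    {"name": "A", "melting_point": 300.0, "boiling_point": 500.0},
    {"name": "B", "melting_point": 1200.0, "boiling_point": 2500.0},
    {"name": "C", "melting_point": 900.0, "boiling_point": 850.0},  # breaks the hypothesis
    {"name": "D", "melting_point": 50.0, "boiling_point": 90.0},
]


def find_violations(rows):
    """Return the names of rows where boiling point is NOT above melting point."""
    return [r["name"] for r in rows if not r["boiling_point"] > r["melting_point"]]


violations = find_violations(elements)
print(violations)
```

Rows returned by such a check are the candidates to inspect as outliers in the plot.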

High level Choice: Analyze - Consume - Present

The present goal refers to the use of vis for the succinct communication of information (= explain).
- e.g., telling a story with data, or guiding an audience through a series of cognitive operations.
One classic example: a diagram in a newspaper
The knowledge communicated is already known to the presenter in advance.

High level Choice: Analyze - Consume - Present Example

High level Choice: Analyze - Consume - Enjoy

The enjoy goal refers to casual encounters with vis.
- Vis for fun!
Sometimes, the goals of the eventual vis user do not match the user goals conjectured by the vis designer!

High level Choice: Analyze - Consume - Enjoy Example

Top 15 Best Global Brands Ranking (2000-2018)
- Source: https://www.youtube.com/watch?v=BQovQUga0VE&ab_channel=TheRankings

High level Choice: Analyze - Produce

In the produce goal, the intent of the user is to generate new material.
- Sometimes, the user intends to use the new material for other vis-related tasks, such as discovery or presentation.
Annotate (~tag): adding graphical or textual annotations associated with one or more visualization elements
Record: saving or capturing visualization elements as persistent artifacts
Derive (= transform): producing new data elements based on existing data elements

High level Choice: Analyze - Produce - Annotate & Record

The difference between annotate and record
- The annotate choice attaches information temporarily (it can be subsequently recorded)
- The record choice saves a persistent artifact (e.g., screen shots, videos, etc.)
- But, it seems that these two are interchangeable in most contexts.

High level Choice: Analyze - Produce - Derive

The derive goal is to produce new data elements based on existing data elements.
- What can be derived? An attribute or a new dataset.
Let’s recall the InfoVis Reference Model

How would you derive a new attribute (creating derived attributes)?
- By changing the type of an attribute (can lose some information)
  - Grade (O) to score (Q): A+ → 98, A0 → 93, B+ → 88, …
  - Temperature (Q) to category (N): 30 → hot, 20 → warm, 0 → cold
- By augmenting external data (adding information)
  - City name (N) to latitude (Q) and longitude (Q)
- By applying mathematical operations
  - Computing the difference between two attributes
  - Log scale
  - Min-max normalization
  - One hot vector encoding
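
A few of the derivations listed above can be sketched in plain Python. The grade-to-score table uses only the three mappings given in the list; the temperature cut-offs (30 and 20 as lower bounds) are an assumption made for this sketch, since the list only gives three sample points. All function names are hypothetical.

```python
import math

# Type change, ordinal -> quantitative: only the three mappings from the list above.
GRADE_TO_SCORE = {"A+": 98, "A0": 93, "B+": 88}


def temperature_to_category(celsius):
    """Type change, quantitative -> categorical. The thresholds are assumed."""
    if celsius >= 30:
        return "hot"
    if celsius >= 20:
        return "warm"
    return "cold"


def difference(a, b):
    """Derive a new attribute as the element-wise difference of two attributes."""
    return [x - y for x, y in zip(a, b)]


def log_scale(values):
    """Base-10 log of each value (values must be positive)."""
    return [math.log10(v) for v in values]


def min_max_normalize(values):
    """Rescale values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode each value as a 0/1 vector over the sorted distinct categories."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


print(temperature_to_category(25))
print(min_max_normalize([0, 5, 10]))
print(one_hot(["hot", "cold", "hot"]))
```

Note that the type change loses information (25 and 29 both become "warm"), while the mathematical operations keep the attribute quantitative.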

High level Choice: Analyze - Produce - Derive Example

High level Choice: Analyze - Produce - Derive

Sometimes, we derive an entirely new dataset.
- e.g., reshaping operations in pandas
- e.g., group-by operations in pandas
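
To make the group-by case concrete without depending on pandas, here is a rough standard-library analogue of a group-by followed by a mean. The table, its column names, and the `group_mean` helper are hypothetical; in pandas the same derivation would typically be written with `DataFrame.groupby`.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input table: one row per student.
rows = [
    {"student": "s1", "major": "CS", "study_hours": 10},
    {"student": "s2", "major": "CS", "study_hours": 14},
    {"student": "s3", "major": "Math", "study_hours": 8},
]


def group_mean(rows, key, value):
    """Derive a new table with one entry per distinct key and the mean of value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {k: mean(v) for k, v in groups.items()}


derived = group_mean(rows, "major", "study_hours")
print(derived)
```

The derived dataset has a different set of items (majors instead of students), which is what makes it a new dataset rather than a new attribute.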

Another example of deriving a new dataset is to change the dataset type.
- e.g., building a k-nearest neighbor (KNN) graph (a table to a network)
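
A brute-force sketch of the table-to-network derivation is shown below. It assumes each row of the table is a numeric point and uses Euclidean distance; the `knn_graph` function and the sample table are illustrative only, and a real system would likely use a spatial index instead of comparing every pair.

```python
import math


def knn_graph(points, k):
    """Return directed edges {row index: [indices of its k nearest rows]}."""
    edges = {}
    for i, p in enumerate(points):
        # Distance from row i to every other row; ties are broken by row index.
        others = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        edges[i] = [j for _, j in others[:k]]
    return edges


# Hypothetical table with two quantitative attributes per row.
table = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
graph = knn_graph(table, k=2)
print(graph)
```

The rows become nodes and the nearest-neighbor relations become links, so network idioms can now be applied to tabular data.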

Mid-level Choice: Search

All of the high level analyze cases require users to search for elements of interest within the vis as the mid level goal.
Search can be classified into four cases depending on
- 1. whether the identity of the search target is known or not and
- 2. whether the location of the search target is known or not.

Lookup: looking up humans (target known) knowing that they belong to mammals (location known)
Locate: locating rabbits (target known) without knowing where they belong (location unknown)
- Commonly, we call just this specific task a “search”.
- e.g., search “abc.txt” on your disk
Browse: browsing all leaves of the mammal subtree (location known, target unknown)
Explore: exploring for a family having the largest number of species (neither target nor location known)
- Note: “explore” was also introduced above as a synonym of “discover”.
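
The four search cases form a two-by-two table over the two questions above, which can be written as a tiny function. The function name `classify_search` is hypothetical and only serves to make the classification explicit.

```python
def classify_search(target_known: bool, location_known: bool) -> str:
    """Map (is the target known?, is the location known?) to a search type."""
    if target_known and location_known:
        return "lookup"
    if target_known:
        return "locate"
    if location_known:
        return "browse"
    return "explore"


for target_known in (True, False):
    for location_known in (True, False):
        print(target_known, location_known, classify_search(target_known, location_known))
```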

Low-level Choice: Query

After searching, you will find a target or set of targets.
Then, you may want to investigate the targets by querying some information.
Identify: returns the characteristics of a single target
Compare: returns the characteristics of multiple targets
Summarize (= overview): returns a comprehensive view of everything
- Extremely common in vis systems as a startup view!

Low-level Choice: Query Example

Identify: identifying the election result of one state
Compare: comparing the election result of one state to another
Summarize: summarizing the election results across all states to determine how many favored one candidate
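
The three query types differ only in how many targets they touch: one, some, or all. The sketch below uses a made-up results table (state and candidate names are placeholders) and hypothetical function names to show that scope difference.

```python
from collections import Counter

# Hypothetical election results: state -> winning candidate.
results = {
    "State A": "Candidate X",
    "State B": "Candidate Y",
    "State C": "Candidate X",
}


def identify(results, state):
    """Identify: return the result of a single state."""
    return results[state]


def compare(results, state_1, state_2):
    """Compare: report whether two states favored the same candidate."""
    return results[state_1] == results[state_2]


def summarize(results):
    """Summarize: count how many states favored each candidate."""
    return Counter(results.values())


print(identify(results, "State A"))
print(compare(results, "State A", "State B"))
print(summarize(results))
```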

Targets

So far, we have learned actions (verbs) that users want to perform on vis.
Targets (nouns) mean some aspect of data that is of interest to users.
What does your vis do?
- Action + Target
- Discover Trends
- Present Distribution
- Compare Topology

All data level
- Trends: a high level characterization of data
  - e.g., increase, peaks, troughs, plateaus
- Outliers: elements that do not fit well with trends
- Features: particular structures of interest
  - e.g., clique in graph theory

Attribute level
- Distribution: the distribution of study time
- Extremes: the student who studies the longest per week
- Dependency: does grade depend on study time?
  - In general, such a causal relationship is really hard to prove!
- Correlation: are grade and study time positively correlated?
- Similarity: similarity between students in terms of study time and grade
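
Several of these attribute-level targets have simple computational counterparts. The sketch below uses a made-up student table and hypothetical helper names; the correlation is the standard Pearson coefficient, written out by hand so the example needs only the standard library.

```python
import math
from collections import Counter
from statistics import mean

# Hypothetical student records, for illustration only.
students = [
    {"name": "s1", "study_hours": 5, "grade": 70},
    {"name": "s2", "study_hours": 10, "grade": 80},
    {"name": "s3", "study_hours": 15, "grade": 90},
    {"name": "s4", "study_hours": 10, "grade": 85},
]


def distribution(rows, attr):
    """Distribution: how often each value of one attribute occurs."""
    return Counter(r[attr] for r in rows)


def extreme(rows, attr):
    """Extremes: the row with the largest value of one attribute."""
    return max(rows, key=lambda r: r[attr])


def pearson(xs, ys):
    """Correlation: Pearson coefficient between two attributes."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


hours = [s["study_hours"] for s in students]
grades = [s["grade"] for s in students]
r = pearson(hours, grades)
print(distribution(students, "study_hours"), extreme(students, "study_hours")["name"], r)
```

A positive coefficient here shows correlation only; it says nothing about whether grade depends on study time.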

Network datasets: topology and paths
Spatial datasets: shapes
The last two pertain to specific types of datasets.

So, How?

So far, we have learned the terms for describing what is to be visualized (data abstraction) and why we visualize it (task abstraction).
Then, how do we visualize data?
- “visualization
There are many useful idioms and studying those idioms is the main goal of this course!
Let me give you a brief overview first…

How - Idiom Abstraction

Visualization Analysis Example

SpaceTree vs. TreeJuxtaposer
SpaceTree : https://www.youtube.com/watch?v=B4vuSLVCJtw&t=112s&ab_channel=HCILUMD
TreeJuxtaposer : https://www.youtube.com/watch?v=eIK3ItXyMi0&ab_channel=HCILUMD

Deriving Attribute Example

When a tree is too big, we need a way to summarize the tree.
The Strahler number: a measure of node importance
Original: 500,000 nodes, Simplified: 5,000 nodes
This can be described as a chained sequence of two task instances.
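
As a rough sketch of the derive step, the Strahler number can be computed with its standard definition: a leaf gets 1, and an internal node gets the maximum value among its children, plus one if that maximum occurs in at least two children. The tree encoding (a dict from node to child list), the function names, and the threshold-based simplification are assumptions made for this example, not the method used in the original system.

```python
def strahler(tree, node):
    """Strahler number of node in a tree given as {node: [children]}."""
    children = tree.get(node, [])
    if not children:
        return 1  # leaves have Strahler number 1
    ranks = sorted((strahler(tree, c) for c in children), reverse=True)
    # Two or more children share the top rank: the parent moves up one level.
    if len(ranks) >= 2 and ranks[0] == ranks[1]:
        return ranks[0] + 1
    return ranks[0]


def simplify(tree, root, threshold):
    """Keep only nodes whose Strahler number reaches the (assumed) threshold."""
    nodes = {root} | {c for children in tree.values() for c in children}
    return {n for n in nodes if strahler(tree, n) >= threshold}


# Hypothetical small tree; a very deep tree would need an iterative version
# to stay within Python's recursion limit.
tree = {"root": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
print(strahler(tree, "root"))
print(simplify(tree, "root", threshold=2))
```

The derived attribute (Strahler number per node) is then used to filter the tree, so the large tree can be summarized with far fewer nodes.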

Summary: Task Abstraction


LEE CHANWOO
