Refining a visual often involves removing unnecessary elements and directing the viewer's attention to specific ones. Enhancing clarity, readability, and effectiveness is an iterative process that requires constant self-evaluation. It can take many hours, even though the output may look simple.

Can Large Language Models (LLMs) be used as a tool for generating visuals? While LLMs excel at text-based tasks, their ability to understand and generate complex concepts can be leveraged to assist in writing code that creates visuals. But can LLMs be used to refine visuals? This post tests LLMs' ability to refine Plotly visuals.

1. Asking Gemini to create a Plotly Chart

🔮 Prompt

Generate a Plotly Python code to compare the stock prices of Adidas and Nike for the past two years.

❇️ Gemini Response

# Python
import yfinance as yf
import plotly.graph_objects as go

# Get stock data for Adidas and Nike
adidas = yf.download('ADDYY', start='2022-09-28', end='2024-09-28')
nike = yf.download('NKE', start='2022-09-28', end='2024-09-28')

# Create a figure
fig = go.Figure()

# Add traces for Adidas and Nike
fig.add_trace(go.Scatter(x=adidas.index, y=adidas['Close'], name='Adidas'))
fig.add_trace(go.Scatter(x=nike.index, y=nike['Close'], name='Nike'))

# Update layout
fig.update_layout(
    title='Adidas vs. Nike Stock Price Comparison',
    xaxis_title='Date',
    yaxis_title='Stock Price'
)

# Show the plot
fig.show()

Explanation:

  1. Import necessary libraries: yfinance for downloading stock data and plotly.graph_objects for creating the plot.
  2. Get stock data: Use yf.download() to fetch the closing prices of Adidas (ADDYY) and Nike (NKE) from September 28, 2022, to September 28, 2024.
  3. Create a figure: Initialize a go.Figure object to hold the plot.
  4. Add traces: Add two scatter plots to the figure, one for Adidas and one for Nike. The x values are the dates, and the y values are the closing prices.
  5. Update layout: Customize the plot's title, x-axis label, and y-axis label.
  6. Show the plot: Display the created plot.

Note: Replace '2022-09-28' and '2024-09-28' with the desired start and end dates if you want to analyze a different time period.

This code will generate an interactive line chart comparing the stock prices of Adidas and Nike over the specified time frame. You can customize the plot further by adding annotations, changing colors, or adjusting other layout properties.

👇 Run the code without any modifications.

In [1]:
import yfinance as yf
import plotly.graph_objects as go

# Get stock data for Adidas and Nike
adidas = yf.download('ADDYY', start='2022-09-28', end='2024-09-28')
nike = yf.download('NKE', start='2022-09-28', end='2024-09-28')

# Create a figure
fig = go.Figure()

# Add traces for Adidas and Nike
fig.add_trace(go.Scatter(x=adidas.index, y=adidas['Close'], name='Adidas'))
fig.add_trace(go.Scatter(x=nike.index, y=nike['Close'], name='Nike'))

# Update layout
fig.update_layout(
    title='Adidas vs. Nike Stock Price Comparison',
    xaxis_title='Date',
    yaxis_title='Stock Price'
)

# Show the plot
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

🔮 Prompt

What version of Gemini are you?

❇️ Gemini Response

I am Gemini 1.5 Flash.

✒️ Thoughts

This is impressive. Gemini generated complete, functional Python code that uses Plotly's graph_objects module. While graph_objects offers fine-grained control over visualizations, it requires more code than simpler alternatives. Notably, the code retrieves stock price data with yfinance, a library that can access data from Yahoo Finance without requiring an API key.

🔮 Prompt asking Gemini to refine the visual without additional instructions

Can you refine the previous Plotly chart to be cleaner? 

❇️ Gemini Response

Here's a refined version of the code, incorporating suggestions for a cleaner and more visually appealing chart:
In [2]:
import pandas as pd
import yfinance as yf
import plotly.graph_objects as go

# Get stock data for Adidas and Nike
adidas = yf.download('ADDYY', start='2022-09-29', end='2024-09-29')
nike = yf.download('NKE', start='2022-09-29', end='2024-09-29')

# Create a figure with a clean layout
fig = go.Figure(layout=go.Layout(
    title='Adidas vs. Nike Stock Price Comparison',
    xaxis_title='Date',
    yaxis_title='Stock Price',
    font=dict(family='Arial', size=12),
    margin=dict(l=50, r=50, b=50, t=50)
))

# Add traces with custom colors and line styles
fig.add_trace(go.Scatter(x=adidas.index, y=adidas['Close'], name='Adidas', line=dict(color='blue', width=2)))
fig.add_trace(go.Scatter(x=nike.index, y=nike['Close'], name='Nike', line=dict(color='red', width=2)))

# Show the plot
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

🔮 Updated prompt to refine the visual with specific instructions

Generate a Plotly Python code to compare the stock prices of Adidas and Nike for the past two years.

- Use Plotly's simple_white template.
- Make Adidas' line color #0051BA.
- Make Nike's line color #EA553B.
- Set the figure's height to 500px.
- Set Adidas' line width to 2px and Nike's line width to 1px.

❇️ Gemini Response

In [3]:
import yfinance as yf
import plotly.graph_objects as go

# Get stock data for Adidas and Nike
adidas = yf.download('ADDYY', start='2022-09-28', end='2024-09-28')
nike = yf.download('NKE', start='2022-09-28', end='2024-09-28')

# Create a figure
fig = go.Figure(layout=go.Layout(template='simple_white'))

# Add traces for Adidas and Nike
fig.add_trace(go.Scatter(x=adidas.index, y=adidas['Close'], name='Adidas', line=dict(color='#0051BA', width=2)))
fig.add_trace(go.Scatter(x=nike.index, y=nike['Close'], name='Nike', line=dict(color='#EA553B', width=1)))

# Update layout
fig.update_layout(
    title='Adidas vs. Nike Stock Price Comparison',
    xaxis_title='Date',
    yaxis_title='Stock Price',
    height=500
)

# Show the plot
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

✒️ Thoughts

While asking Gemini to refine a visual didn't produce optimal results, providing specific instructions that align with Plotly's API yielded a more effective outcome.

2. Asking ChatGPT to create a Plotly Chart

🔮 Prompt

Generate a Plotly Python code to compare the stock prices of Adidas and Nike for the past two years.

❇️ ChatGPT Response

You can use Plotly along with a library like yfinance to retrieve the stock data for Adidas (ADS.DE) and Nike (NKE), and then plot it over the last two years.

Here is the Python code that generates a comparison chart of Adidas and Nike stock prices using Plotly:

# Python
import yfinance as yf
import plotly.graph_objects as go
from datetime import datetime, timedelta

# Define the stock symbols for Adidas (ADS.DE) and Nike (NKE)
adidas_symbol = 'ADS.DE'
nike_symbol = 'NKE'

# Define the time range: last two years from today
end_date = datetime.now()
start_date = end_date - timedelta(days=730)

# Fetch historical stock data for Adidas and Nike
adidas_data = yf.download(adidas_symbol, start=start_date, end=end_date)
nike_data = yf.download(nike_symbol, start=start_date, end=end_date)

# Create a plotly figure
fig = go.Figure()

# Add Adidas stock prices to the plot
fig.add_trace(go.Scatter(x=adidas_data.index, y=adidas_data['Close'], mode='lines', name='Adidas'))

# Add Nike stock prices to the plot
fig.add_trace(go.Scatter(x=nike_data.index, y=nike_data['Close'], mode='lines', name='Nike'))

# Customize the layout of the plot
fig.update_layout(
    title='Adidas vs Nike Stock Prices (Last Two Years)',
    xaxis_title='Date',
    yaxis_title='Stock Price (USD)',
    hovermode='x unified',
    legend_title='Company',
    template='plotly_dark'
)

# Display the plot
fig.show()

Key Steps:

  1. yfinance is used to fetch the stock price data for Adidas (ADS.DE) and Nike (NKE).
  2. The date range is set to the last two years using timedelta.
  3. Plotly is used to create a line chart comparing the stock prices of both companies.

Make sure you have the necessary libraries installed:

pip install yfinance plotly

👇 Run the code without any modifications.

In [4]:
import yfinance as yf
import plotly.graph_objects as go
from datetime import datetime, timedelta

# Define the stock symbols for Adidas (ADS.DE) and Nike (NKE)
adidas_symbol = 'ADS.DE'
nike_symbol = 'NKE'

# Define the time range: last two years from today
end_date = datetime.now()
start_date = end_date - timedelta(days=730)

# Fetch historical stock data for Adidas and Nike
adidas_data = yf.download(adidas_symbol, start=start_date, end=end_date)
nike_data = yf.download(nike_symbol, start=start_date, end=end_date)

# Create a plotly figure
fig = go.Figure()

# Add Adidas stock prices to the plot
fig.add_trace(go.Scatter(x=adidas_data.index, y=adidas_data['Close'], mode='lines', name='Adidas'))

# Add Nike stock prices to the plot
fig.add_trace(go.Scatter(x=nike_data.index, y=nike_data['Close'], mode='lines', name='Nike'))

# Customize the layout of the plot
fig.update_layout(
    title='Adidas vs Nike Stock Prices (Last Two Years)',
    xaxis_title='Date',
    yaxis_title='Stock Price (USD)',
    hovermode='x unified',
    legend_title='Company',
    template='plotly_dark'
)

# Display the plot
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

🔮 Prompt

What version of GPT are you?

❇️ ChatGPT Response

I am based on OpenAI's GPT-4 architecture, which is the latest version as of my knowledge cutoff in 2023. Specifically, you're interacting with a version that has enhanced capabilities in areas like reasoning, creativity, and following instructions compared to previous models.

✒️ Thoughts

This is impressive as well. Like Gemini, ChatGPT generated complete, functional Python code that uses Plotly's graph_objects module. However, two differences stand out: (1) ChatGPT uses the "plotly_dark" template, while Gemini did not specify a template, and (2) ChatGPT uses the ADS.DE ticker, whereas Gemini uses ADDYY. To clarify, ADS.DE is for European (German) investors trading in euros, while ADDYY is for American investors trading in U.S. dollars via ADRs.

  1. ADS.DE: This is the ticker for Adidas AG on the Frankfurt Stock Exchange in Germany, where the company is headquartered. The ".DE" suffix indicates that the stock is traded on a German exchange. The stock is priced in euros (EUR).

  2. ADDYY: This is the ticker for Adidas's American Depositary Receipts (ADRs), which are traded on the OTC (Over-the-Counter) market in the United States. ADRs allow U.S. investors to invest in non-U.S. companies without directly dealing with foreign exchanges. Each ADR for Adidas typically represents a fraction of a share (in Adidas’s case, usually one-half of a share). The price of ADDYY is quoted in U.S. dollars (USD).
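
As a rough worked example of how the two tickers relate, the sketch below converts a euro price for ADS.DE into an implied dollar price for one ADDYY ADR, using the one-half ratio described above. The share price and exchange rate are hypothetical numbers, not market data, and real ADR quotes can deviate from this parity value.

```python
def implied_adr_price_usd(share_price_eur, usd_per_eur, adrs_per_share=2):
    """Implied USD price of one ADR, assuming `adrs_per_share` ADRs equal one ordinary share."""
    return share_price_eur * usd_per_eur / adrs_per_share

# Hypothetical inputs: a 200 EUR share price and 1.10 USD per EUR.
adr_price = implied_adr_price_usd(share_price_eur=200.0, usd_per_eur=1.10)
print(f"Implied ADR price: ${adr_price:.2f}")
```

This is also why the two tickers produce lines at very different heights on the same chart, even though they track the same company.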

A few more observations:

  • Technically, ChatGPT's code is incorrect. The closing prices for ADS.DE are in euros, but the y-axis is labeled as USD.
  • ChatGPT uses variables to store the symbols for Nike and Adidas. This improves readability compared to Gemini, which hard-codes the tickers directly inside the yf.download() method calls.
  • ChatGPT uses datetime.timedelta() to find the start of the stock price date range. This is a more programmatic approach than hard-coding the start date.
  • ChatGPT unifies the hover text on the x-axis, a subtle detail that lets viewers compare the exact stock prices for a given date.
  • Although ChatGPT made a mistake by treating euros as U.S. dollars, it still writes more viewer-friendly code overall.
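
One hedged way to sidestep the euro/dollar mismatch is to rebase both series to 100 at the first date, so the chart compares relative performance rather than raw prices. This is a swapped-in technique, not something either model produced. The rebase_to_100 helper and the sample prices below are assumptions for illustration.

```python
import pandas as pd

def rebase_to_100(series):
    """Scale a price series so that its first value equals 100."""
    return series / series.iloc[0] * 100

dates = pd.date_range("2024-01-01", periods=3, freq="D")

# Hypothetical closing prices in each stock's own currency.
ads_eur = pd.Series([200.0, 210.0, 190.0], index=dates)
nke_usd = pd.Series([100.0, 98.0, 104.0], index=dates)

indexed = pd.DataFrame({
    "Adidas (ADS.DE)": rebase_to_100(ads_eur),
    "Nike (NKE)": rebase_to_100(nke_usd),
})
print(indexed.round(1))
```

With rebased values, the y-axis label becomes something like "Indexed price (start = 100)" and no currency symbol is needed.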

🔮 Updated prompt to refine the visual

Generate a Plotly Python code to compare the stock prices of Adidas and Nike for the past two years.

- Use Plotly's simple_white template.
- Make Adidas' line color #0051BA.
- Make Nike's line color #EA553B.
- Set the figure's height to 500px.
- Set Adidas' line width to 2px and Nike's line width to 1px.

❇️ ChatGPT Response

Here’s the Python code that compares the stock prices of Adidas and Nike over the past two years, using Plotly's "simple_white" template with the specified line colors and widths, and setting the figure height to 500px:
In [5]:
import yfinance as yf
import plotly.graph_objects as go
from datetime import datetime, timedelta

# Define the stock symbols for Adidas (ADS.DE) and Nike (NKE)
adidas_symbol = 'ADS.DE'
nike_symbol = 'NKE'

# Define the time range: last two years from today
end_date = datetime.now()
start_date = end_date - timedelta(days=730)

# Fetch historical stock data for Adidas and Nike
adidas_data = yf.download(adidas_symbol, start=start_date, end=end_date)
nike_data = yf.download(nike_symbol, start=start_date, end=end_date)

# Create a plotly figure
fig = go.Figure()

# Add Adidas stock prices to the plot with specified color and line width
fig.add_trace(go.Scatter(
    x=adidas_data.index, 
    y=adidas_data['Close'], 
    mode='lines', 
    name='Adidas',
    line=dict(color='#0051BA', width=2)  # Adidas color with 2px line
))

# Add Nike stock prices to the plot with specified color and line width
fig.add_trace(go.Scatter(
    x=nike_data.index, 
    y=nike_data['Close'], 
    mode='lines', 
    name='Nike',
    line=dict(color='#EA553B', width=1)  # Nike color with 1px line
))

# Customize the layout of the plot
fig.update_layout(
    title='Adidas vs Nike Stock Prices (Last Two Years)',
    xaxis_title='Date',
    yaxis_title='Stock Price',
    hovermode='x unified',
    legend_title='Company',
    template='simple_white',  # Use simple_white template
    height=500  # Set figure height to 500px
)

# Display the plot
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

✒️ Thoughts

Specific instructions proved as effective for ChatGPT as they were for Gemini. How does ChatGPT's self-refinement ability compare with Gemini's? To assess this, I created a new context and ran the "refining" prompt three times.

🔮 Prompt

Create a new ChatGPT context and repeat the first prompt.

Generate a Plotly Python code to compare the stock prices of Adidas and Nike for the past two years.

❇️ ChatGPT Response

In [6]:
import yfinance as yf
import plotly.graph_objs as go
from datetime import datetime, timedelta

# Fetch stock data for Adidas (ADDYY) and Nike (NKE) over the past two years
end_date = datetime.today().strftime('%Y-%m-%d')
start_date = (datetime.today() - timedelta(days=2*365)).strftime('%Y-%m-%d')

adidas = yf.download('ADDYY', start=start_date, end=end_date)
nike = yf.download('NKE', start=start_date, end=end_date)

# Create traces for Adidas and Nike stock prices
trace_adidas = go.Scatter(x=adidas.index, y=adidas['Close'], mode='lines', name='Adidas')
trace_nike = go.Scatter(x=nike.index, y=nike['Close'], mode='lines', name='Nike')

# Create the figure
fig = go.Figure()

# Add traces to the figure
fig.add_trace(trace_adidas)
fig.add_trace(trace_nike)

# Customize the layout
fig.update_layout(
    title='Stock Prices of Adidas vs Nike Over the Past Two Years',
    xaxis_title='Date',
    yaxis_title='Stock Price (USD)',
    legend_title='Companies',
    template='plotly',
)

# Display the figure
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

🔮 Prompt asking ChatGPT to refine the visual without additional instructions

Can you refine the previous Plotly chart to be cleaner? 

❇️ ChatGPT Response 1

To make the chart cleaner and more refined, here are a few adjustments:

- Smoother lines: Ensure smoother rendering of the stock price lines.
- Use hover mode: Provide detailed information on hover for better user interaction.
- Update axis formatting: Improve readability by formatting the dates on the x-axis and adding currency symbols on the y-axis.
- Add gridlines and markers: Enhance visualization with gridlines and markers at significant points.
In [7]:
import yfinance as yf
import plotly.graph_objs as go
from datetime import datetime, timedelta

# Fetch stock data for Adidas (ADDYY) and Nike (NKE) over the past two years
end_date = datetime.today().strftime('%Y-%m-%d')
start_date = (datetime.today() - timedelta(days=2*365)).strftime('%Y-%m-%d')

adidas = yf.download('ADDYY', start=start_date, end=end_date)
nike = yf.download('NKE', start=start_date, end=end_date)

# Create traces for Adidas and Nike stock prices
trace_adidas = go.Scatter(
    x=adidas.index, 
    y=adidas['Close'], 
    mode='lines+markers', 
    name='Adidas',
    line=dict(shape='spline', color='blue', width=2),
    marker=dict(size=4)
)

trace_nike = go.Scatter(
    x=nike.index, 
    y=nike['Close'], 
    mode='lines+markers', 
    name='Nike',
    line=dict(shape='spline', color='orange', width=2),
    marker=dict(size=4)
)

# Create the figure
fig = go.Figure()

# Add traces to the figure
fig.add_trace(trace_adidas)
fig.add_trace(trace_nike)

# Customize the layout for a cleaner appearance
fig.update_layout(
    title=dict(text='Adidas vs Nike Stock Prices (Past Two Years)', x=0.5),
    xaxis_title='Date',
    yaxis_title='Stock Price (USD)',
    legend_title='Companies',
    template='simple_white',
    hovermode='x',
    xaxis=dict(
        showgrid=True,
        showline=True,
        gridcolor='lightgray',
        linecolor='black',
        tickformat='%b %Y'  # Format the x-axis for better date readability
    ),
    yaxis=dict(
        showgrid=True,
        showline=True,
        gridcolor='lightgray',
        linecolor='black',
        tickprefix='$'  # Add dollar symbol to the y-axis
    ),
    legend=dict(
        x=0.05, y=0.95, 
        bordercolor="Black", 
        borderwidth=1
    ),
    plot_bgcolor='rgba(0, 0, 0, 0)',  # Transparent background
    margin=dict(l=40, r=40, t=60, b=40)
)

# Display the figure
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

❇️ ChatGPT Response 2

Only run the code.

In [8]:
import yfinance as yf
import plotly.graph_objs as go
from datetime import datetime, timedelta

# Define the stock symbols for Adidas and Nike
adidas_symbol = 'ADDYY'  # Adidas stock symbol in Yahoo Finance
nike_symbol = 'NKE'  # Nike stock symbol in Yahoo Finance

# Set the date range for the past two years
end_date = datetime.now()
start_date = end_date - timedelta(days=2*365)

# Download stock data from Yahoo Finance
adidas_data = yf.download(adidas_symbol, start=start_date, end=end_date)
nike_data = yf.download(nike_symbol, start=start_date, end=end_date)

# Create traces for Adidas and Nike
trace_adidas = go.Scatter(
    x=adidas_data.index, 
    y=adidas_data['Adj Close'],
    mode='lines', 
    name='Adidas',
    line=dict(color='royalblue', width=2),  # Clean, consistent line width and color
    hovertemplate='Adidas: $%{y:.2f}<extra></extra>'  # Clean hover label
)

trace_nike = go.Scatter(
    x=nike_data.index, 
    y=nike_data['Adj Close'],
    mode='lines', 
    name='Nike',
    line=dict(color='darkorange', width=2),  # Clean, consistent line width and color
    hovertemplate='Nike: $%{y:.2f}<extra></extra>'  # Clean hover label
)

# Create layout for the plot
layout = go.Layout(
    title={
        'text': 'Stock Prices of Adidas vs Nike (Past Two Years)',
        'x': 0.5,  # Center title
        'xanchor': 'center',
        'yanchor': 'top'
    },
    xaxis=dict(
        title='Date',
        tickformat='%b %Y',  # Display dates in a clean format (e.g., Jan 2022)
        showgrid=False  # Remove vertical gridlines for cleaner look
    ),
    yaxis=dict(
        title='Stock Price (USD)',
        tickprefix='$',  # Show dollar sign
        showgrid=True, 
        gridcolor='lightgray',  # Light grid for better focus on the data
    ),
    hovermode='x unified',  # Unifies hover across both traces
    plot_bgcolor='white',  # Set the background color to white for a clean look
    legend=dict(
        title='Legend',
        orientation="h",  # Horizontal legend at the bottom
        yanchor="bottom",
        y=1.02,
        xanchor="center",
        x=0.5
    ),
    margin=dict(l=40, r=40, t=50, b=40),  # Set margins to avoid unnecessary whitespace
)

# Create the figure
fig = go.Figure(data=[trace_adidas, trace_nike], layout=layout)

# Show the plot
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

❇️ ChatGPT Response 3

Only run the code.

In [9]:
import yfinance as yf
import plotly.graph_objs as go
from datetime import datetime, timedelta

# Define the stock symbols for Adidas and Nike
adidas_symbol = 'ADDYY'
nike_symbol = 'NKE'

# Set the date range for the past two years
end_date = datetime.now()
start_date = end_date - timedelta(days=2*365)

# Download stock data from Yahoo Finance
adidas_data = yf.download(adidas_symbol, start=start_date, end=end_date)
nike_data = yf.download(nike_symbol, start=start_date, end=end_date)

# Create traces for Adidas and Nike
trace_adidas = go.Scatter(x=adidas_data.index, y=adidas_data['Adj Close'],
                          mode='lines', name='Adidas',
                          line=dict(color='blue', width=2))

trace_nike = go.Scatter(x=nike_data.index, y=nike_data['Adj Close'],
                        mode='lines', name='Nike',
                        line=dict(color='orange', width=2))

# Create a layout with refined design
layout = go.Layout(
    title={
        'text': "Stock Prices Comparison: Adidas vs. Nike<br><sub>Past Two Years</sub>",
        'y':0.9,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    },
    xaxis=dict(
        title='Date',
        showgrid=True,
        gridcolor='lightgrey',
        zeroline=False
    ),
    yaxis=dict(
        title='Stock Price (USD)',
        showgrid=True,
        gridcolor='lightgrey',
        zeroline=False,
        tickformat='$,.2f'  # Format y-axis labels as currency
    ),
    legend=dict(
        y=1,
        x=1,
        xanchor='right',
        yanchor='top',
        bgcolor='rgba(255, 255, 255, 0.5)',
        bordercolor='lightgrey',
        borderwidth=1
    ),
    hovermode='x unified',  # Shows tooltips in unified format
    plot_bgcolor='white',   # Clean background
    margin=dict(l=50, r=50, t=100, b=50)
)

# Create the figure
fig = go.Figure(data=[trace_adidas, trace_nike], layout=layout)

# Show the plot
fig.show()
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

3. Automatically Generating Goals and Charts using LIDA

What is LIDA?

LIDA is a Python library for generating data visualizations and data-faithful infographics.

LIDA treats visualization generation as a multi-step process. Its premise is that well-structured pipelines built on large language models (LLMs) and image generation models (IGMs) are effective for visualization tasks. It is also grammar-agnostic: the visualizations and infographics it creates are not tied to a specific charting library or grammar.

Here are LIDA's core capabilities:

  • Data Summarization
  • Automated EDA
  • Grammar-Agnostic Visualizations
  • Explanation
  • Self-Evaluation
  • Visualization Repair

For this test, I used a CSV dataset so that I could manually replicate the refinement process later on.

LIDA Setup

Install the lida package.

pip install lida

Import the required packages.

In [10]:
from lida import Manager, TextGenerationConfig, llm 
from lida.utils import plot_raster 
import os
import pandas as pd

Read the OpenAI API key from a text file and store it in the OPENAI_API_KEY variable.

In [11]:
# Read the key and strip surrounding whitespace (e.g., a trailing newline)
with open('OPENAI_API_KEY.txt', 'r') as f:
    OPENAI_API_KEY = f.read().strip()

# Another option is to set the OpenAI API key directly.
# Be careful not to accidentally publish your API key.
# OPENAI_API_KEY = "YOUR_OPENAI_API_KEY_GOES_HERE"

Summarize Data, Generate Goals

LIDA offers a Manager class that exposes the core functionality of the LIDA system. Run the code below to initialize the Manager with OpenAI as the backend.

In [12]:
lida = Manager(
    text_gen = llm(
        provider="openai",
        api_key=OPENAI_API_KEY
    )
)

Each manager method takes a textgen_config argument, a TextGenerationConfig object that configures the text generation process (with parameters for the model, temperature, max_tokens, etc.). One of its fields is use_cache. If set to True, the manager caches the generated text associated with that method, which speeds up repeated calls and helps avoid API limits.

Configure LIDA to use the "gpt-4-turbo" model. A major weakness I've noticed with LIDA is that it throws an error while generating a visual if the prompt is too complicated for the GPT model to handle. GPT-4o mini is tempting because of its lower latency and lower cost, but it tends to throw more errors.

Set the temperature to 0.2. Temperature is a parameter that controls the level of randomness in an LLM's output. It is often set between 0 and 1 but can sometimes be higher. The lower the temperature, the more deterministic and repetitive the output. For this notebook, I'm using a low temperature so that the results are easier to reproduce.
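
To make the effect of temperature concrete, here is a minimal sketch of the standard temperature-scaled softmax used when sampling tokens. The logits are hypothetical, and how a given API applies temperature internally is not covered by this post, so treat this as an illustration of the general idea only.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

low = softmax_with_temperature(logits, 0.2)   # sharper: top token dominates
high = softmax_with_temperature(logits, 1.0)  # flatter: more variety when sampling

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At a temperature of 0.2 nearly all of the probability mass sits on the highest-scoring token, which is why repeated runs tend to produce similar output. A temperature of exactly 0 would divide by zero in this sketch.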

In [13]:
textgen_config = TextGenerationConfig(
    n=1,
    temperature=0.2,
    model="gpt-4-turbo",
    use_cache=False
)

Create a summary object that can be used to generate goals.

In [14]:
summary = lida.summarize(
    "nike-adidas-stock-price-2022-2023.csv",
    summary_method="default",
    textgen_config=textgen_config
)

Create goals.

In [15]:
goals = lida.goals(
    summary,
    n=2,
    textgen_config=textgen_config
)

for goal in goals:
    display(goal)

Goal 0


Question: How do the stock prices of Adidas (ADS) and Nike (NKE) compare over the period from September 2022 to August 2023?

Visualization: Line chart with 'Date' on the x-axis and both 'ADS' and 'NKE' on the y-axis.

Rationale: Using a line chart to plot 'ADS' and 'NKE' against 'Date' allows us to visually compare the stock price trends of Adidas and Nike over the specified period. This visualization will help identify patterns, trends, and any correlation between the stock movements of the two companies.

Goal 1


Question: What is the volatility of Adidas and Nike stock prices during the year 2022-2023?

Visualization: Box plot for 'ADS' and 'NKE'.

Rationale: A box plot for each stock ('ADS' and 'NKE') will provide a summary of the distribution and volatility of the stock prices over the year. It will show the median, quartiles, and outliers, which are crucial for understanding the risk and stability associated with each stock.

Generate Visualizations

In [16]:
i = 0
library = "plotly"
charts = lida.visualize(
    summary=summary,
    goal=goals[i],
    textgen_config=textgen_config,
    library=library
)  

display(charts[0])

Refine the visual using human language

In [17]:
code = charts[0].code

instructions = [
    "use a simple_white background template",
    "get rid of gridlines",
    "change Adidas's line color to navy",
    "change Nike's line color to red",
    "change Nike's line style to dotted",
    "add a black solid vertical line on 2022-11-03",
    "add annotation on 2022-11-03 with the text \"Adidas' FIFA campaign: Messi and others in augmented reality\"",
    "add a black solid vertical line on 2023-07-15",
    "add annotation on 2023-07-15 with the text \"Messi announces MLS deal with Miami\""
]

edited_charts = lida.edit(
    code=code,
    summary=summary,
    instructions=instructions,
    library=library,
    textgen_config=textgen_config
)

edited_charts[0]
Out[17]:

✒️ Thoughts

Both the original and edited visuals appeared unnatural. Also, refining the visual with LIDA was time-consuming because each call took a long time to run.

View the code generated by LIDA and GPT

Print the visualization code generated by LIDA and GPT.

In [18]:
print(edited_charts[0].code)
import plotly.express as px
import pandas as pd

def plot(data: pd.DataFrame):
    data['Date'] = pd.to_datetime(data['Date'], errors='coerce')
    data = data[pd.notna(data['Date'])]
    
    fig = px.line(data, x='Date', y=['ADS', 'NKE'], labels={'value': 'Stock Price', 'variable': 'Company'}, title='Comparison of Adidas (ADS) and Nike (NKE) Stock Prices from September 2022 to August 2023')
    fig.update_traces(line=dict(color='navy'), selector=dict(name='ADS'))
    fig.update_traces(line=dict(color='red', dash='dot'), selector=dict(name='NKE'))
    fig.update_xaxes(tickangle=45, showgrid=False)
    fig.update_yaxes(showgrid=False)
    fig.update_layout(template='plotly_white', legend_title_text='Company')
    fig.add_vline(x='2022-11-03', line=dict(color='black', width=2))
    fig.add_annotation(x='2022-11-03', y=0.95, yref="paper", text="Adidas' FIFA campaign: Messi and others in augmented reality", showarrow=True, arrowhead=1, ax=20)
    fig.add_vline(x='2023-07-15', line=dict(color='black', width=2))
    fig.add_annotation(x='2023-07-15', y=0.05, yref="paper", text="Messi announces MLS deal with Miami", showarrow=True, arrowhead=1, ax=20)

    return fig

chart = plot(data)

Explain the underlying code of the edited chart

In [19]:
code = edited_charts[0].code

explanations = lida.explain(
    code=code,
    library=library,
    textgen_config=textgen_config
) 

for row in explanations[0]:
    print(row["section"]," ** ", row["explanation"])
accessibility  **  The chart is a line graph comparing the stock prices of Adidas (ADS) and Nike (NKE) from September 2022 to August 2023. It uses navy color for Adidas and a red dotted line for Nike, enhancing visual distinction. The x-axis represents dates, angled for better readability, and the y-axis represents stock prices, with both axes having a clean appearance without grid lines. The chart includes annotations for significant events, providing insights into potential impacts on stock prices.
transformation  **  This section of the code transforms the 'Date' column in the DataFrame to datetime objects, ensuring that all dates are in a consistent format. It also filters out any rows where the 'Date' is not a valid datetime object, thus cleaning the data for accurate plotting.
visualization  **  This section creates the line chart using Plotly Express, setting 'Date' as the x-axis and the stock prices of 'ADS' and 'NKE' as the y-axis. It customizes the appearance of the lines, axes, and layout to enhance readability and aesthetics. Vertical lines and annotations are added to highlight specific events, providing context that may explain fluctuations in the stock prices.

Create visualizations using user queries

Using a custom natural language query is possible. Here are two examples.

In [20]:
user_query = "What are the statistical distributions of the stock prices of Nike and Adidas?"

charts = lida.visualize(
    summary=summary,
    goal=user_query,
    textgen_config=textgen_config,
    library="plotly"
)  

charts[0]
Out[20]:
In [21]:
user_query = "What are the statistical distributions of the stock prices of Nike and Adidas? \
    Use a box plot to display the distributions."

charts = lida.visualize(
    summary=summary,
    goal=user_query,
    textgen_config=textgen_config,
    library="plotly",
)  

charts[0]
Out[21]:

4. Manually Creating a Chart

Finally, create a chart manually, based on the code generated in the previous sections.

Import packages and read data

In [22]:
import plotly.express as px
from datetime import timedelta
In [23]:
df_stocks = pd.read_csv('nike-adidas-stock-price-2022-2023.csv')
df_stocks.head()
Out[23]:
         Date         ADS         NKE
0  2022-09-07  141.385361  107.127426
1  2022-09-08  141.763779  107.374313
2  2022-09-09  144.751221  109.586372
3  2022-09-12  148.595047  110.978798
4  2022-09-13  142.301514  104.401840
In [24]:
df_stocks.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 251 entries, 0 to 250
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Date    251 non-null    object 
 1   ADS     251 non-null    float64
 2   NKE     251 non-null    float64
dtypes: float64(2), object(1)
memory usage: 6.0+ KB

Change the 'Date' column's data type to datetime64.

In [25]:
df_stocks['Date'] = pd.to_datetime(df_stocks['Date'])
In [26]:
fig = px.line(
    df_stocks,
    x='Date',
    y=['ADS', 'NKE'],
    labels={'value': 'Stock Price', 'variable': 'Company'},
    title='<b>Stock Prices of Adidas and Nike (2022-2023)</b><br>\
<span style="color: #aaa">Did Messi help Adidas widen the gap in stock performance?</span>',
    height=550,
    template='simple_white',
)

fig.update_xaxes(showline=True, showgrid=False, zeroline=False, ticks="outside")
fig.update_yaxes(showline=True, showgrid=False, zeroline=False, ticks="outside")

fig.update_traces(
    line=dict(color='#0051ba', width=2),
    selector=dict(name='ADS')
)

fig.update_traces(
    line=dict(color='#EA553B', width=1),
    selector=dict(name='NKE')
)

fig.update_layout(
    font_family='Helvetica, Inter, Arial, sans-serif',
    xaxis_range=[
        df_stocks['Date'].min() - pd.to_timedelta('7 days'),
        df_stocks['Date'].max() + pd.to_timedelta('5 W')
    ],
    yaxis_range=[
        70, 230
    ],
    showlegend=False,
    yaxis_tickprefix="$",
    yaxis_showtickprefix="all"  # or "first" or "last"
)

# Add vertical lines and annotations
fig.add_vline(
    x="2022-11-03",
    line=dict(
        color='#999999',
        width=1,
    ),
    line_dash="dot",
    opacity=1
)

fig.add_annotation(
    x="2022-11-03",
    xanchor="left",
    xshift=5,
    y=1,
    yanchor="top",
    yref="paper",
    text="<b>11/03/2022</b><br>\
Adidas' FIFA campaign:<br>\
Messi and others in augmented reality",
    align="left",
    font=dict(
        size=12
    ),
    showarrow=False,
)

fig.add_vline(
    x="2023-07-15",
    line=dict(
        color="#999999",
        width=1
    ),
    line_dash="dot",
    opacity=1,
)

fig.add_annotation(
    x="2023-07-15",
    xanchor="left",
    xshift=5,
    y=1,
    yanchor="top",
    yref="paper",
    text="<b>07/15/2023</b><br>\
Messi announces<br>\
MLS deal with Miami",
    showarrow=False,
    align="left",
    font=dict(
        size=12
    )
)

from PIL import Image

adidas_logo_img = Image.open('images/logo_adidas_blue.png')
nike_logo_img = Image.open('images/logo_nike_orange.png')

fig.add_layout_image(
    dict(
        source=adidas_logo_img,
        xref="paper", yref="paper",
        x=1, y=0.66,
        sizex=0.07, sizey=0.1,
        xanchor="right", yanchor="bottom"
    )
)

fig.add_layout_image(
    dict(
        source=nike_logo_img,
        xref="paper", yref="paper",
        x=0.99, y=0.17,
        sizex=0.1, sizey=0.035,
        xanchor="right", yanchor="bottom"
    )
)

fig.show()

Conclusion

LLMs can be a valuable tool for generating visuals, but they do not always produce perfect results. Both Gemini and ChatGPT refined the Plotly chart effectively when given specific editing instructions. LIDA is promising, but it currently lacks sufficient documentation and is slower than using Gemini or ChatGPT directly.

Generating starter code with an LLM and then tweaking it manually proved to be the most effective approach. I still needed to consult Google for minor adjustments, such as removing gridlines or setting line colors in Plotly, but using LLMs to lay the foundation saved significant time.

Given the current capabilities of LLMs, creating a truly well-refined visual will likely still require human oversight.