Streamlit + Readme update: copy to cURL (#22)

2025-09-01 18:20:06 +00:00 · 2024-03-04 12:41:38 -05:00 · 2024-03-04 12:41:38 -05:00 · 3bf56717c9
commit 3bf56717c9
parent 0495552b11
6 changed files with 79 additions and 34 deletions
--- a/README.md
+++ b/README.md
@ -27,8 +27,35 @@
  <img src="images/geico_shu_recording_cropped.gif"/>
 </p>

-Want to see more examples of Skyvern in action? Click [here](#real-world-examples-of-skyvern)!
+Traditional approaches to browser automation required writing custom scripts for each website, often relying on DOM parsing and XPath-based interactions that would break whenever the website layout changed.

+Instead of relying only on code-defined XPath interactions, Skyvern adds computer vision and LLMs to the mix to parse items in the viewport in real time, create a plan for interaction, and interact with them.
+
+This approach gives us a few advantages:
+
+1. Skyvern can operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow, without any customized code
+1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
+1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
+    1. If you wanted to get an auto insurance quote from Geico, the answer to a common question “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
+    1. If you were doing competitor analysis, it’s understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)
+
+
+Want to see examples of Skyvern in action? Jump to [#real-world-examples-of-skyvern](#real-world-examples-of-skyvern)
+
+
+# How it works
+Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
+
+<picture>
+  <source media="(prefers-color-scheme: dark)" srcset="images/skyvern-system-diagram-dark.png" />
+  <img src="images/skyvern-system-diagram-light.png" />
+</picture>
+
+<!-- TODO (suchintan): 
+Expand the diagram above to go deeper into how:
+1. We draw bounding boxes
+2. We parse the HTML + extract the image to generate an interactable element map
+-->

 # Quickstart
 This quickstart guide will walk you through getting Skyvern up and running on your local machine. 
@ -72,20 +99,26 @@ pre-commit install

 ## Running your first automation

+### Executing tasks (UI)
+Once you have the UI running, you can start an automation by filling out the fields shown in the UI and clicking "Execute".

-# How it works
-Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major difference: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
+<p align="center">
+  <img src="images/skyvern_visualizer_run_task.png"/>
+</p>

-<picture>
-  <source media="(prefers-color-scheme: dark)" srcset="images/skyvern-system-diagram-dark.png"/>
-  <img src="images/skyvern-system-diagram-light.png"/>
-</picture>
+### Executing tasks (cURL)
+
+```
+curl -X POST -H 'Content-Type: application/json' -H 'x-api-key: {Your local API key}' -d '{
+    "url": "https://www.geico.com",
+    "webhook_callback_url": "",
+    "navigation_goal": "Navigate through the website until you generate an auto insurance quote. Do not generate a home insurance quote. If this page contains an auto insurance quote, consider the goal achieved",
+    "data_extraction_goal": "Extract all quote information in JSON format including the premium amount, the timeframe for the quote.",
+    "navigation_payload": "{Your data here}",
+    "proxy_location": "NONE"
+}' http://0.0.0.0:8000/api/v1/tasks
+```

-<!-- > TODO (suchintan): 
-Expand the diagram above to go deeper into how:
-1. We draw bounding boxes
-2. We parse the HTML + extract the image to generate an interactable element map
-->

 # Real-world examples of Skyvern
 <!-- > TODO (suchintan):
@ -123,18 +156,6 @@ More extensive documentation can be found on our [documentation website](https:/

 Our focus is bringing stability to browser-based workflows. We leverage LLMs to create an AI Agent capable of interacting with websites like you or I would — all via a simple API call.

-Traditional approaches required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions which would break whenever the website layouts changed.
-
-Skyvern operates like a human — increasing reliability by not relying on fragile scripts, instead relying on computer vision to parse items in the viewport and interact with them the way a human would.
-
-This approach gives us a few advantages:
-
-1. Skyvern can operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow, without any customized code
-1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
-1. Skyvern is able to circumvent or navigate through many bot detection methods as many of them rely on allowing people to access the websites
-1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
-    1. If you wanted to get an auto insurance quote from Geico, the answer to a common question “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
-    1. If you were doing competitor analysis, it’s understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)


 # Feature Roadmap
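
Note on the README change: the cURL request added under "Executing tasks (cURL)" can also be built from Python. The following is a minimal sketch using only the standard library; `build_task_request` is a hypothetical helper, and the URL, header names, and body fields are copied from the README snippet rather than from any documented client. It constructs the request without sending it.

```python
import json
import urllib.request
from typing import Any

# Values taken from the README snippet; replace the placeholder with a real key.
API_KEY = "{Your local API key}"
TASKS_URL = "http://0.0.0.0:8000/api/v1/tasks"


def build_task_request(
    url: str,
    navigation_goal: str,
    data_extraction_goal: str,
    navigation_payload: Any,
) -> urllib.request.Request:
    """Build (but do not send) a POST request shaped like the README cURL call."""
    body = {
        "url": url,
        "webhook_callback_url": "",
        "navigation_goal": navigation_goal,
        "data_extraction_goal": data_extraction_goal,
        "navigation_payload": navigation_payload,
        "proxy_location": "NONE",
    }
    return urllib.request.Request(
        TASKS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": API_KEY},
        method="POST",
    )


request = build_task_request(
    url="https://www.geico.com",
    navigation_goal="Navigate through the website until you generate an auto insurance quote.",
    data_extraction_goal="Extract all quote information in JSON format.",
    navigation_payload="{Your data here}",
)
```

With a local server running, `urllib.request.urlopen(request)` would send it; judging by `create_task` in `api.py`, a successful response is expected to carry a `task_id` field.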
--- a/images/skyvern_visualizer_run_task.png
+++ b/images/skyvern_visualizer_run_task.png
--- a/pyproject.toml
+++ b/pyproject.toml
@ -43,6 +43,8 @@ asyncache = "^0.3.1"
 orjson = "^3.9.10"
 structlog = "^23.2.0"
 plotly = "^5.18.0"
+clipboard = "^0.0.4"
+curlify = "^2.2.1"


 [tool.poetry.group.dev.dependencies]
@ -66,6 +68,7 @@ notebook = "^7.0.6"
 freezegun = "^1.2.2"
 snoop = "^0.4.3"
 rich = {extras = ["jupyter"], version = "^13.7.0"}
+clipboard = "^0.0.4"


 [build-system]
--- a/streamlit_app/visualizer/api.py
+++ b/streamlit_app/visualizer/api.py
@ -1,7 +1,9 @@
 import json
 from typing import Any

+import curlify
 import requests
+from requests import PreparedRequest

 from skyvern.forge.sdk.schemas.tasks import TaskRequest

@ -11,7 +13,7 @@ class SkyvernClient:
        self.base_url = base_url
        self.credentials = credentials

-    def create_task(self, task_request_body: TaskRequest) -> str | None:
+    def generate_curl_params(self, task_request_body: TaskRequest) -> tuple[str, dict[str, Any], dict[str, str]]:
        url = f"{self.base_url}/tasks"
        payload = task_request_body.model_dump()
        headers = {
@ -19,11 +21,23 @@ class SkyvernClient:
            "x-api-key": self.credentials,
        }

+        return url, payload, headers
+
+    def create_task(self, task_request_body: TaskRequest) -> str | None:
+        url, payload, headers = self.generate_curl_params(task_request_body)
+
        response = requests.post(url, headers=headers, data=json.dumps(payload))
        if "task_id" not in response.json():
            return None
        return response.json()["task_id"]

+    def copy_curl(self, task_request_body: TaskRequest) -> str:
+        url, payload, headers = self.generate_curl_params(task_request_body)
+
+        req = requests.Request("POST", url, headers=headers, data=json.dumps(payload, indent=4))
+
+        return curlify.to_curl(req.prepare())
+
    def get_task(self, task_id: str) -> dict[str, Any] | None:
        """Get a task by id."""
        url = f"{self.base_url}/internal/tasks/{task_id}"
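
Note on the `api.py` change: the refactor separates building the request parts (URL, payload, headers) from what is done with them (sending, or rendering as a cURL command). A minimal sketch of that split follows. `build_task_request_parts` and `to_curl` are hypothetical names, the `Content-Type` header is assumed from the README example, and the cURL string is assembled by hand with `shlex.quote` instead of `curlify` so the sketch runs on the standard library alone.

```python
import json
import shlex
from typing import Any


def build_task_request_parts(
    base_url: str, api_key: str, payload: dict[str, Any]
) -> tuple[str, dict[str, Any], dict[str, str]]:
    """Return (url, payload, headers), the same three values the diff's helper returns."""
    url = f"{base_url}/tasks"
    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    return url, payload, headers


def to_curl(url: str, payload: dict[str, Any], headers: dict[str, str]) -> str:
    """Render a POST request as a shell-safe cURL command (hand-rolled, not curlify)."""
    parts = ["curl", "-X", "POST"]
    for name, value in headers.items():
        parts += ["-H", shlex.quote(f"{name}: {value}")]
    # Pretty-print the body, as copy_curl does with indent=4.
    parts += ["-d", shlex.quote(json.dumps(payload, indent=4))]
    parts.append(shlex.quote(url))
    return " ".join(parts)


url, payload, headers = build_task_request_parts(
    base_url="http://localhost:8000/api/v1",  # hypothetical base URL
    api_key="local-key",  # hypothetical credential
    payload={"url": "https://www.geico.com", "proxy_location": "NONE"},
)
curl_command = to_curl(url, payload, headers)
```

Because both the sender and the cURL renderer consume the same tuple, the copied command stays in step with the request the UI would actually send.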
--- a/streamlit_app/visualizer/sample_data.py
+++ b/streamlit_app/visualizer/sample_data.py
@ -1,16 +1,11 @@
-from pydantic import BaseModel
+from skyvern.forge.sdk.schemas.tasks import TaskRequest


-class SampleData(BaseModel):
+class SampleTaskRequest(TaskRequest):
    name: str
-    url: str
-    navigation_goal: str
-    data_extraction_goal: str
-    navigation_payload: dict
-    extracted_information_schema: dict


-geico_sample_data = SampleData(
+geico_sample_data = SampleTaskRequest(
    name="Geico",
    url="https://www.geico.com",
    navigation_goal="Navigate through the website until you generate an auto insurance quote. Do not generate a home insurance quote. If this page contains an auto insurance quote, consider the goal achieved",
--- a/streamlit_app/visualizer/streamlit.py
+++ b/streamlit_app/visualizer/streamlit.py
@ -1,3 +1,4 @@
+import clipboard
 import pandas as pd
 import streamlit as st

@ -104,6 +105,11 @@ st.markdown("# **:dragon: Skyvern :dragon:**")
 st.markdown(f"### **{select_env} - {select_org}**")
 execute_tab, visualizer_tab = st.tabs(["Execute", "Visualizer"])

+
+def copy_curl_to_clipboard(task_request_body: TaskRequest) -> None:
+    clipboard.copy(client.copy_curl(task_request_body=task_request_body))
+
+
 with execute_tab:
    example_tabs = st.tabs([supported_example.name for supported_example in supported_examples])

@ -111,8 +117,14 @@ with execute_tab:
        with example_tab:
            create_column, explanation_column = st.columns([1, 2])
            with create_column:
+                run_task, copy_curl = st.columns([3, 1])
+                task_request_body = supported_examples[i]
+                copy_curl.button(
+                    "Copy cURL", on_click=copy_curl_to_clipboard, kwargs={"task_request_body": task_request_body}
+                )
                with st.form("task_form"):
-                    st.markdown("## Run a task")
+                    run_task.markdown("## Run a task")
+
                    example = supported_examples[i]
                    # Create all the fields to create a TaskRequest object
                    st_url = st.text_input("URL*", value=example.url, key="url")
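
Note on callbacks created inside a loop, such as the one registered for the "Copy cURL" button: a closure looks up loop variables when it is called, not when it is defined, so every callback can end up seeing the last item. The sketch below shows the effect in plain Python without Streamlit; `describe` and the example names are hypothetical.

```python
from functools import partial

examples = ["geico", "example_b", "example_c"]  # hypothetical example names


def describe(name: str) -> str:
    """Stand-in for the work a button callback would do."""
    return f"copy cURL for {name}"


late_bound = []
bound_now = []
for name in examples:
    # The lambda reads `name` only when invoked, after the loop has finished.
    late_bound.append(lambda: describe(name))
    # partial captures the current value of `name` immediately.
    bound_now.append(partial(describe, name))

late_results = [callback() for callback in late_bound]
bound_results = [callback() for callback in bound_now]
```

Here `late_results` contains the last example three times, while `bound_results` contains one entry per example. Passing the value explicitly (through `partial`, a default argument, or a framework's `args`/`kwargs` parameters) avoids the problem once more than one example tab exists.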