mirror of
https://github.com/Skyvern-AI/skyvern.git
synced 2025-09-01 18:20:06 +00:00
Streamlit + Readme update: copy to cURL (#22)
parent
0495552b11
commit
3bf56717c9
6 changed files with 79 additions and 34 deletions
69
README.md
@ -27,8 +27,35 @@
<img src="images/geico_shu_recording_cropped.gif"/>
</p>

Want to see more examples of Skyvern in action? Click [here](#real-world-examples-of-skyvern)!
Traditional approaches to browser automations required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions which would break whenever the website layouts changed.

Instead of only relying on code-defined XPath interactions, Skyvern adds computer vision and LLMs to the mix to parse items in the viewport in real-time, create a plan for interaction and interact with them.

This approach gives us a few advantages:

1. Skyvern can operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow, without any customized code
1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
   1. If you wanted to get an auto insurance quote from Geico, the answer to a common question “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
   1. If you were doing competitor analysis, it’s understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)


Want to see examples of Skyvern in action? Jump to [#real-world-examples-of-skyvern](#real-world-examples-of-skyvern)


# How it works
Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).

<picture>
<source media="(prefers-color-scheme: dark)" srcset="images/skyvern-system-diagram-dark.png" />
<img src="images/skyvern-system-diagram-light.png" />
</picture>

<!-- TODO (suchintan):
Expand the diagram above to go deeper into how:
1. We draw bounding boxes
2. We parse the HTML + extract the image to generate an interactable element map
-->

# Quickstart
This quickstart guide will walk you through getting Skyvern up and running on your local machine.
@ -72,20 +99,26 @@ pre-commit install

## Running your first automation

### Executing tasks (UI)
Once you have the UI running, you can start an automation by filling out the fields shown in the UI and clicking "Execute"

# How it works
Skyvern was inspired by the Task-Driven autonomous agent design popularized by [BabyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) -- with one major difference: we give Skyvern the ability to interact with websites using browser automation libraries like [Playwright](https://playwright.dev/).
<p align="center">
<img src="images/skyvern_visualizer_run_task.png"/>
</p>

<picture>
<source media="(prefers-color-scheme: dark)" srcset="images/skyvern-system-diagram-dark.png"/>
<img src="images/skyvern-system-diagram-light.png"/>
</picture>
### Executing tasks (cURL)

```
curl -X POST -H 'Content-Type: application/json' -H 'x-api-key: {Your local API key}' -d '{
    "url": "https://www.geico.com",
    "webhook_callback_url": "",
    "navigation_goal": "Navigate through the website until you generate an auto insurance quote. Do not generate a home insurance quote. If this page contains an auto insurance quote, consider the goal achieved",
    "data_extraction_goal": "Extract all quote information in JSON format including the premium amount, the timeframe for the quote.",
    "navigation_payload": "{Your data here}",
    "proxy_location": "NONE"
}' http://0.0.0.0:8000/api/v1/tasks
```
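
For readers who prefer Python over a shell, the same request can be sketched with the standard library alone. This is a minimal sketch and not part of Skyvern: `build_task_request` is a hypothetical helper, and the base URL, header names and payload fields are copied from the cURL example above. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumed values, taken from the cURL example; replace the placeholder with your local API key.
BASE_URL = "http://0.0.0.0:8000/api/v1"
API_KEY = "{Your local API key}"


def build_task_request(payload: dict, base_url: str = BASE_URL, api_key: str = API_KEY) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a task. Hypothetical helper."""
    return urllib.request.Request(
        url=f"{base_url}/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


payload = {
    "url": "https://www.geico.com",
    "webhook_callback_url": "",
    "navigation_goal": (
        "Navigate through the website until you generate an auto insurance quote. "
        "Do not generate a home insurance quote. "
        "If this page contains an auto insurance quote, consider the goal achieved"
    ),
    "data_extraction_goal": (
        "Extract all quote information in JSON format including the premium amount, "
        "the timeframe for the quote."
    ),
    "navigation_payload": "{Your data here}",  # placeholder, as in the cURL example
    "proxy_location": "NONE",
}

request = build_task_request(payload)

# To send it against a running local server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL, headers and body before anything hits the server.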

<!-- > TODO (suchintan):
Expand the diagram above to go deeper into how:
1. We draw bounding boxes
2. We parse the HTML + extract the image to generate an interactable element map
-->

# Real-world examples of Skyvern
<!-- > TODO (suchintan):
@ -123,18 +156,6 @@ More extensive documentation can be found on our [documentation website](https:/

Our focus is bringing stability to browser-based workflows. We leverage LLMs to create an AI Agent capable of interacting with websites like you or I would — all via a simple API call.

Traditional approaches required writing custom scripts for websites, often relying on DOM parsing and XPath-based interactions which would break whenever the website layouts changed.

Skyvern operates like a human — increasing reliability by not relying on fragile scripts, instead relying on computer vision to parse items in the viewport and interact with them the way a human would.

This approach gives us a few advantages:

1. Skyvern can operate on websites it’s never seen before, as it’s able to map visual elements to actions necessary to complete a workflow, without any customized code
1. Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
1. Skyvern is able to circumvent or navigate through many bot detection methods as many of them rely on allowing people to access the websites
1. Skyvern leverages LLMs to reason through interactions to ensure we can cover complex situations. Examples include:
   1. If you wanted to get an auto insurance quote from Geico, the answer to a common question “Were you eligible to drive at 18?” could be inferred from the driver receiving their license at age 16
   1. If you were doing competitor analysis, it’s understanding that an Arnold Palmer 22 oz can at 7/11 is almost definitely the same product as a 23 oz can at Gopuff (even though the sizes are slightly different, which could be a rounding error!)


# Feature Roadmap

BIN
images/skyvern_visualizer_run_task.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 238 KiB |

@ -43,6 +43,8 @@ asyncache = "^0.3.1"
orjson = "^3.9.10"
structlog = "^23.2.0"
plotly = "^5.18.0"
clipboard = "^0.0.4"
curlify = "^2.2.1"


[tool.poetry.group.dev.dependencies]
@ -66,6 +68,7 @@ notebook = "^7.0.6"
freezegun = "^1.2.2"
snoop = "^0.4.3"
rich = {extras = ["jupyter"], version = "^13.7.0"}
clipboard = "^0.0.4"


[build-system]

@ -1,7 +1,9 @@
import json
from typing import Any

import curlify
import requests
from requests import PreparedRequest

from skyvern.forge.sdk.schemas.tasks import TaskRequest

@ -11,7 +13,7 @@ class SkyvernClient:
        self.base_url = base_url
        self.credentials = credentials

    def create_task(self, task_request_body: TaskRequest) -> str | None:
    def generate_curl_params(self, task_request_body: TaskRequest) -> PreparedRequest:
        url = f"{self.base_url}/tasks"
        payload = task_request_body.model_dump()
        headers = {
@ -19,11 +21,23 @@ class SkyvernClient:
            "x-api-key": self.credentials,
        }

        return url, payload, headers

    def create_task(self, task_request_body: TaskRequest) -> str | None:
        url, payload, headers = self.generate_curl_params(task_request_body)

        response = requests.post(url, headers=headers, data=json.dumps(payload))
        if "task_id" not in response.json():
            return None
        return response.json()["task_id"]

    def copy_curl(self, task_request_body: TaskRequest) -> str:
        url, payload, headers = self.generate_curl_params(task_request_body)

        req = requests.Request("POST", url, headers=headers, data=json.dumps(payload, indent=4))

        return curlify.to_curl(req.prepare())

    def get_task(self, task_id: str) -> dict[str, Any] | None:
        """Get a task by id."""
        url = f"{self.base_url}/internal/tasks/{task_id}"

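
The `copy_curl` method in the hunk above hands the formatting over to `curlify`. To illustrate the idea without that dependency, here is a minimal standard-library sketch that turns a URL, headers and a JSON payload into a shell-safe cURL command. `to_curl_command` is a hypothetical helper, not curlify's implementation, and its output may differ from what curlify produces. The `Content-Type` header is an assumption, since the diff elides most of the `headers` dict.

```python
import json
import shlex
from typing import Any


def to_curl_command(url: str, headers: dict[str, str], payload: dict[str, Any]) -> str:
    """Format a POST request as a cURL command (hypothetical helper, not curlify)."""
    parts = ["curl", "-X", "POST"]
    for name, value in headers.items():
        parts += ["-H", f"{name}: {value}"]
    parts += ["-d", json.dumps(payload), url]
    # shlex.join quotes every argument so the result can be pasted into a POSIX shell.
    return shlex.join(parts)


command = to_curl_command(
    url="http://0.0.0.0:8000/api/v1/tasks",
    headers={"Content-Type": "application/json", "x-api-key": "{Your local API key}"},
    payload={"url": "https://www.geico.com", "proxy_location": "NONE"},
)
print(command)
```

Splitting the command back with `shlex.split` recovers the original argument list, which is a quick way to check that the quoting is sound.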
@ -1,16 +1,11 @@
from pydantic import BaseModel
from skyvern.forge.sdk.schemas.tasks import TaskRequest


class SampleData(BaseModel):
class SampleTaskRequest(TaskRequest):
    name: str
    url: str
    navigation_goal: str
    data_extraction_goal: str
    navigation_payload: dict
    extracted_information_schema: dict


geico_sample_data = SampleData(
geico_sample_data = SampleTaskRequest(
    name="Geico",
    url="https://www.geico.com",
    navigation_goal="Navigate through the website until you generate an auto insurance quote. Do not generate a home insurance quote. If this page contains an auto insurance quote, consider the goal achieved",

@ -1,3 +1,4 @@
import clipboard
import pandas as pd
import streamlit as st

@ -104,6 +105,11 @@ st.markdown("# **:dragon: Skyvern :dragon:**")
st.markdown(f"### **{select_env} - {select_org}**")
execute_tab, visualizer_tab = st.tabs(["Execute", "Visualizer"])


def copy_curl_to_clipboard(task_request_body: TaskRequest) -> None:
    clipboard.copy(client.copy_curl(task_request_body=task_request_body))


with execute_tab:
    example_tabs = st.tabs([supported_example.name for supported_example in supported_examples])

@ -111,8 +117,14 @@ with execute_tab:
        with example_tab:
            create_column, explanation_column = st.columns([1, 2])
            with create_column:
                run_task, copy_curl = st.columns([3, 1])
                task_request_body = supported_examples[i]
                copy_curl.button(
                    "Copy cURL", on_click=lambda: copy_curl_to_clipboard(task_request_body=task_request_body)
                )
                with st.form("task_form"):
                    st.markdown("## Run a task")
                    run_task.markdown("## Run a task")

                    example = supported_examples[i]
                    # Create all the fields to create a TaskRequest object
                    st_url = st.text_input("URL*", value=example.url, key="url")