Amazon Nova Act is a research preview released by Amazon Artificial General Intelligence (AGI) that enables developers to build agents capable of taking actions within web browsers. This technology combines natural language instructions with Python scripting and Playwright automation to navigate websites, click buttons, fill forms, and extract data dynamically.
Unlike traditional web automation tools that rely on brittle scripts and website-specific code, Nova Act uses AI to interact with websites more adaptively, helping it handle changes in web interfaces without requiring constant maintenance.

Prerequisites
Before getting started with Amazon Nova Act, you need:
- Operating System: macOS or Ubuntu (Windows is not currently supported)
- Python: Version 3.10 or above
- Amazon Account: For generating an API key
- Location Requirement: Amazon Nova Act is currently only available as a research preview in the US
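The operating system and Python requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper name and is not part of the Nova Act SDK. It treats any Linux as acceptable, which is looser than the Ubuntu requirement listed above.

```python
import platform
import sys
from typing import List, Optional, Tuple


def check_prerequisites(
    system: Optional[str] = None,
    version: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the basics look fine."""
    # platform.system() returns "Darwin" on macOS and "Linux" on Ubuntu.
    system = system or platform.system()
    version = version or sys.version_info[:2]
    problems = []
    if system not in ("Darwin", "Linux"):
        problems.append(f"Unsupported operating system: {system}")
    if version < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Passing `system` and `version` explicitly is only there to make the function easy to try out; called with no arguments it inspects the current interpreter.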
Getting Your Amazon Nova Act API Key
To use Amazon Nova Act:
- Navigate to nova.amazon.com/act and sign in with your Amazon account
- Select "Act" in the Labs section of the navigation pane
- Generate an API key
- If access isn't immediate, you may be placed on a waitlist and notified by email when granted access
Installation
Once you have your API key:
# Install the SDK
pip install nova-act
# Set your API key as an environment variable
export NOVA_ACT_API_KEY="your_api_key"
Note: The first time you run Nova Act, it may take 1-2 minutes to start as it installs Playwright modules. Subsequent runs will start more quickly.
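A missing API key is easier to diagnose if the script checks for it before opening a browser. The sketch below reads the `NOVA_ACT_API_KEY` variable set above; `get_api_key` is a hypothetical helper name, not something the SDK provides.

```python
import os
from typing import Mapping


def get_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the key from NOVA_ACT_API_KEY or raise a clear error."""
    key = env.get("NOVA_ACT_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "NOVA_ACT_API_KEY is not set; export it before running Nova Act."
        )
    return key
```

Calling `get_api_key()` at the top of a script fails fast with a readable message instead of failing somewhere inside the first browser session.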
Basic Usage
Let's start with a simple example directly from the documentation:
from nova_act import NovaAct
with NovaAct(starting_page="https://www.amazon.com") as nova:
    nova.act("search for a coffee maker")
    nova.act("select the first result")
    nova.act("scroll down or up until you see 'add to cart' and then click 'add to cart'")
This script will:
- Open Chrome and navigate to Amazon
- Search for coffee makers
- Select the first result
- Find and click the "Add to Cart" button
Interactive Mode
Nova Act can be used interactively for experimentation:
# Start Python shell
$ python
>>> from nova_act import NovaAct
>>> nova = NovaAct(starting_page="https://www.amazon.com")
>>> nova.start()
>>> nova.act("search for a coffee maker")
After the first action completes, continue with the next step:
>>> nova.act("select the first result")
Note: according to the documentation, Nova Act does not currently support IPython; use the standard Python shell instead.
Effective Prompting Strategies
The official documentation emphasizes breaking tasks into smaller steps for reliability:
1. Be Specific and Clear
❌ DON'T
nova.act("From my order history, find my most recent order from India Palace and reorder it")
✅ DO
nova.act("Click the hamburger menu icon, go to Order History, find my most recent order from India Palace and reorder it")
2. Break Complex Tasks into Smaller Steps
❌ DON'T
nova.act("book me a hotel that costs less than $100 with the highest star rating")
✅ DO
nova.act(f"search for hotels in Houston between {startdate} and {enddate}")
nova.act("sort by avg customer review")
nova.act("hit book on the first hotel that is $100 or less")
nova.act(f"fill in my name, address, and DOB according to {blob}")
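The step-by-step pattern above can be wrapped in a small helper that sends one short prompt at a time. This is a sketch only: `run_steps` is a hypothetical name, and `agent` is assumed to be any object with an `act(prompt)` method, such as the `nova` object in the earlier examples.

```python
from typing import Any, Iterable, List


def run_steps(agent: Any, steps: Iterable[str]) -> List[Any]:
    """Call agent.act() once per step, in order, and collect the results."""
    results = []
    for number, step in enumerate(steps, start=1):
        # Printing each step makes it obvious which prompt was running
        # if a later step fails.
        print(f"Step {number}: {step}")
        results.append(agent.act(step))
    return results
```

With a running session this would be called as `run_steps(nova, ["search for a coffee maker", "select the first result"])`. Keeping the prompts in a list also makes it easy to reorder or split a step that turns out to be unreliable.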
Extracting Data from Web Pages
Nova Act supports structured data extraction using Pydantic models:
from pydantic import BaseModel
from nova_act import NovaAct, BOOL_SCHEMA
class Book(BaseModel):
    title: str
    author: str
class BookList(BaseModel):
    books: list[Book]
def get_books(year: int) -> BookList | None:
    """Get top NYT books of the year and return as a BookList."""
    with NovaAct(starting_page=f"https://en.wikipedia.org/wiki/List_of_The_New_York_Times_number-one_books_of_{year}#Fiction") as nova:
        result = nova.act(
            "Return the books in the Fiction list",
            schema=BookList.model_json_schema()
        )
        if not result.matches_schema:
            # Act response did not match the schema
            return None
        # Parse the JSON into the pydantic model
        book_list = BookList.model_validate(result.parsed_response)
        return book_list
For simple boolean responses, use the built-in BOOL_SCHEMA:
result = nova.act("Am I logged in?", schema=BOOL_SCHEMA)
if result.matches_schema:
    if result.parsed_response:
        print("You are logged in")
    else:
        print("You are not logged in")
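A boolean check like this has three possible outcomes: yes, no, and "the response did not match the schema". The sketch below folds them into a single return value. `as_bool` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that mimic only the two attributes used above (`matches_schema` and `parsed_response`); they are not real Nova Act results.

```python
from types import SimpleNamespace
from typing import Any, Optional


def as_bool(result: Any) -> Optional[bool]:
    """Return True/False for a schema-conforming result, or None otherwise."""
    if not result.matches_schema:
        return None
    return bool(result.parsed_response)


# Stand-in results for illustration only.
logged_in = SimpleNamespace(matches_schema=True, parsed_response=True)
unparseable = SimpleNamespace(matches_schema=False, parsed_response=None)

print(as_bool(logged_in))    # True
print(as_bool(unparseable))  # None
```

Returning `None` for the unmatched case keeps callers from silently treating "could not tell" as "not logged in".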
Parallel Processing with Multiple Browser Sessions
The GitHub documentation confirms that Nova Act supports parallel processing with multiple browser sessions:
from concurrent.futures import ThreadPoolExecutor, as_completed
from nova_act import NovaAct, ActError
# Accumulate results here
all_books = []
# Set maximum concurrent browser sessions
with ThreadPoolExecutor(max_workers=10) as executor:
    # Get books from multiple years in parallel
    future_to_books = {
        executor.submit(get_books, year): year
        for year in range(2010, 2025)
    }
    # Collect results
    for future in as_completed(future_to_books.keys()):
        try:
            year = future_to_books[future]
            book_list = future.result()
            if book_list is not None:
                all_books.extend(book_list.books)
        except ActError as exc:
            print(f"Skipping year {year} due to error: {exc}")
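The submit-and-collect pattern above can be tried without opening any browsers. The sketch below uses only the standard library; `run_in_parallel` and `fake_get_books` are hypothetical names, and `fake_get_books` is a stand-in for `get_books` that fails for one year on purpose. It catches a broad `Exception`, where real Nova Act code would more likely catch `ActError` as shown above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def run_in_parallel(
    worker: Callable, items: Iterable, max_workers: int = 4
) -> Tuple[Dict, Dict]:
    """Run worker(item) for every item; return (results, errors) keyed by item."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_item = {executor.submit(worker, item): item for item in items}
        for future in as_completed(future_to_item):
            # Look up the item before result() has a chance to raise.
            item = future_to_item[future]
            try:
                results[item] = future.result()
            except Exception as exc:
                errors[item] = exc
    return results, errors


def fake_get_books(year: int) -> list:
    """Hypothetical stand-in for get_books(); fails for one year on purpose."""
    if year == 2012:
        raise ValueError("page did not load")
    return [f"Book of {year}"]


results, errors = run_in_parallel(fake_get_books, range(2010, 2015))
print(sorted(results))  # [2010, 2011, 2013, 2014]
print(sorted(errors))   # [2012]
```

Collecting errors per item, rather than letting the first exception end the loop, means one failed session does not discard the results of the others.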
Authentication and Browser State
For websites requiring authentication, Nova Act provides options to use existing Chrome profiles:
import os
from nova_act import NovaAct
user_data_dir = "path/to/my/chrome_profile"
os.makedirs(user_data_dir, exist_ok=True)
with NovaAct(
    starting_page="https://amazon.com/",
    user_data_dir=user_data_dir,
    clone_user_data_dir=False
) as nova:
    input("Log into your websites, then press enter...")
There's also a built-in helper script for this purpose:
python -m nova_act.samples.setup_chrome_user_data_dir
Handling Sensitive Information
The documentation specifically warns about handling sensitive data:
from getpass import getpass
# Sign in properly
nova.act("enter username janedoe and click on the password field")
# Use Playwright directly for sensitive data
nova.page.keyboard.type(getpass()) # getpass() collects password securely
# Continue after credentials are entered
nova.act("sign in")
Security Warning: The documentation notes that screenshots taken during execution will capture any visible sensitive information.
Additional Features
Working with Captchas
result = nova.act("Is there a captcha on the screen?", schema=BOOL_SCHEMA)
if result.matches_schema and result.parsed_response:
    input("Please solve the captcha and hit return when done")
Downloading Files
with nova.page.expect_download() as download_info:
    nova.act("click on the download button")
# Save permanently
download_info.value.save_as("my_downloaded_file")
Recording Sessions
nova = NovaAct(
    starting_page="https://example.com",
    logs_directory="/path/to/logs",
    record_video=True
)
Real-World Example: Apartment Search
A dev.to article on Nova Act walks through a real-world example: finding apartments near a train station. Here's the core structure of that example:
from concurrent.futures import ThreadPoolExecutor, as_completed
import pandas as pd
from pydantic import BaseModel
from nova_act import NovaAct
class Apartment(BaseModel):
    address: str
    price: str
    beds: str
    baths: str
class ApartmentList(BaseModel):
    apartments: list[Apartment]
class CaltrainBiking(BaseModel):
    biking_time_hours: int
    biking_time_minutes: int
    biking_distance_miles: float
# First find apartments
with NovaAct(starting_page="https://zumper.com/", headless=headless) as client:
    client.act(
        "Close any cookie banners. "
        f"Search for apartments near {caltrain_city}, CA, "
        f"then filter for {bedrooms} bedrooms and {baths} bathrooms."
    )
    # Extract apartment listings with schema
    result = client.act(
        "Return the currently visible list of apartments",
        schema=ApartmentList.model_json_schema()
    )
# Then check biking distances in parallel
with ThreadPoolExecutor() as executor:
    # Submit parallel tasks to check biking distance to train station
    future_to_apartment = {
        executor.submit(add_biking_distance, apartment, caltrain_city, headless): apartment
        for apartment in all_apartments
    }
    # Process results
    for future in as_completed(future_to_apartment.keys()):
        # Collect and process results
        pass
# Sort and display results
apartments_df = pd.DataFrame(apartments_with_biking)
This example demonstrates how Nova Act can:
- Extract structured data from websites
- Process multiple browser sessions in parallel
- Combine information from different sources
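The final "combine and sort" step can be illustrated without pandas or a browser. The sketch below assumes each apartment has already been merged with the `CaltrainBiking` fields into a plain dictionary; the helper names and the sample rows are made up for illustration and are not taken from the original example.

```python
from typing import Dict, List


def total_biking_minutes(row: Dict) -> int:
    """Convert the hours/minutes pair into a single number of minutes."""
    return row["biking_time_hours"] * 60 + row["biking_time_minutes"]


def sort_by_biking_time(apartments: List[Dict]) -> List[Dict]:
    """Return the apartments ordered from shortest to longest bike ride."""
    return sorted(apartments, key=total_biking_minutes)


# Made-up sample rows, for illustration only.
sample = [
    {"address": "1 Example St", "biking_time_hours": 1, "biking_time_minutes": 5},
    {"address": "2 Example Ave", "biking_time_hours": 0, "biking_time_minutes": 12},
]

for row in sort_by_biking_time(sample):
    print(row["address"], total_biking_minutes(row))
# 2 Example Ave 12
# 1 Example St 65
```

Sorting on a single derived value avoids comparing hours and minutes separately, which is where off-by-one ordering mistakes tend to creep in.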
Known Limitations
According to the documentation, Nova Act has these limitations:
- Browser-Only: Cannot interact with non-browser applications
- Limited Reliability: May struggle with high-level prompts
- UI Constraints: Cannot interact with elements hidden behind mouseovers
- Browser Modals: Cannot interact with browser window modals like permission requests
- Geography Limitation: Currently only available in the US
- Research Status: This is an experimental preview, not a production service
NovaAct Constructor Options
The documentation lists these parameters for initializing NovaAct:
NovaAct(
    starting_page="https://example.com",  # Required: URL to start at
    headless=False,                       # Whether to run browser visibly or not
    quiet=False,                          # Whether to suppress logs
    user_data_dir=None,                   # Path to Chrome profile
    nova_act_api_key=None,                # API key (can use env var instead)
    logs_directory=None,                  # Where to store logs
    record_video=False,                   # Whether to record session
    # Other options as documented
)
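One way to keep these options manageable is to build them in a single place and switch between a visible debugging run and an unattended run. The sketch below is an assumption about how you might organize this, not an SDK feature: `nova_act_options` is a hypothetical helper, and it uses only parameter names from the list above. It enables `record_video` only when a `logs_directory` is given, mirroring the recording example, which sets both.

```python
from typing import Any, Dict, Optional


def nova_act_options(
    starting_page: str,
    debug: bool = False,
    logs_directory: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments using only the parameters listed above."""
    options: Dict[str, Any] = {
        "starting_page": starting_page,
        # Show the browser window only while debugging.
        "headless": not debug,
        # Record video only in debug runs that also have somewhere to log.
        "record_video": debug and logs_directory is not None,
    }
    if logs_directory is not None:
        options["logs_directory"] = logs_directory
    return options


print(nova_act_options("https://example.com", debug=True, logs_directory="/tmp/logs"))
```

The resulting dictionary would be passed as `NovaAct(**nova_act_options(...))`.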
Conclusion
Amazon Nova Act represents an innovative approach to browser automation by combining AI with traditional automation techniques. While still in research preview with some limitations, it offers a promising direction for making web automation more reliable and adaptable.
The key advantage of Nova Act is its ability to break down complex browser interactions into discrete, reliable steps using natural language instructions, which can be combined with Python code for flexible, powerful automation workflows.
As this is a research preview available only in the US, expect ongoing changes and improvements. For the most current information, always refer to the official documentation at GitHub and nova.amazon.com/act.