At AWS re:Invent 2022, we previewed Amazon SageMaker geospatial capabilities, allowing data scientists and machine learning (ML) engineers to build, train, and deploy ML models using geospatial data. Geospatial ML with Amazon SageMaker supports access to readily available geospatial data, purpose-built processing operations and open source libraries, pre-trained ML models, and built-in visualization tools with Amazon SageMaker’s geospatial capabilities.
During the preview, we had lots of interest and great feedback from customers. Today, Amazon SageMaker geospatial capabilities are generally available with new security updates and additional sample use cases.
Introducing Geospatial ML features with SageMaker Studio
To get started, use the quick setup to launch Amazon SageMaker Studio in the US West (Oregon) Region. Make sure to use the default Jupyter Lab 3 version when you create a new user in the Studio. Now you can navigate to the homepage in SageMaker Studio. Then select the Data menu and click on Geospatial.
Here is an overview of three key Amazon SageMaker geospatial capabilities:
- Earth Observation jobs – Acquire, transform, and visualize satellite imagery data using purpose-built geospatial operations or pre-trained ML models to make predictions and get useful insights.
- Vector Enrichment jobs – Enrich your data with operations, such as converting geographical coordinates to readable addresses.
- Map Visualization – Visualize satellite images or map data uploaded from a CSV, JSON, or GeoJSON file.
You can create all Earth Observation Jobs (EOJ) in the SageMaker Studio notebook to process satellite data using purpose-built geospatial operations. Here is a list of purpose-built geospatial operations that are supported by the SageMaker Studio notebook:
- Band Stacking – Combine multiple spectral properties to create a single image.
- Cloud Masking – Identify cloud and cloud-free pixels to get improved and accurate satellite imagery.
- Cloud Removal – Remove pixels containing parts of a cloud from satellite imagery.
- Geomosaic – Combine multiple images for greater fidelity.
- Land Cover Segmentation – Identify land cover types such as vegetation and water in satellite imagery.
- Resampling – Scale images to different resolutions.
- Spectral Index – Obtain a combination of spectral bands that indicate the abundance of features of interest.
- Temporal Statistics – Calculate statistics through time for multiple GeoTIFFs in the same area.
- Zonal Statistics – Calculate statistics on user-defined regions.
A Vector Enrichment Job (VEJ) enriches your location data through purpose-built operations for reverse geocoding and map matching. While you need to use a SageMaker Studio notebook to execute a VEJ, you can view all the jobs you create using the user interface. To use the visualization in the notebook, you first need to export your output to your Amazon S3 bucket.
- Reverse Geocoding – Convert coordinates (latitude and longitude) to human-readable addresses.
- Map Matching – Snap inaccurate GPS coordinates to road segments.
Using the Map Visualization, you can visualize geospatial data, the inputs to your EOJ or VEJ jobs as well as the outputs exported from your Amazon Simple Storage Service (Amazon S3) bucket.
Security Updates
At GA, we have two major security updates—AWS Key Management Service (AWS KMS) for customer managed AWS KMS key support and Amazon Virtual Private Cloud (Amazon VPC) for geospatial operations in the customer Amazon VPC environment.
AWS KMS customer managed keys offer increased flexibility and control by enabling customers to use their own keys to encrypt geospatial workloads.
You can use KmsKeyId
to specify your own key in StartEarthObservationJob
and StartVectorEnrichmentJob
as an optional parameter. If the customer doesn’t provide KmsKeyId
, a service owned key will be used to encrypt the customer content. To learn more, see SageMaker geospatial capabilities AWS KMS Support in the AWS documentation.
Using Amazon VPC, you have full control over your network environment and can more securely connect to your geospatial workloads on AWS. You can use SageMaker Studio or Notebook in your Amazon VPC environment for SageMaker geospatial operations and execute SageMaker geospatial API operations through an interface VPC endpoint in SageMaker geospatial operations.
To get started with Amazon VPC support, configure Amazon VPC on SageMaker Studio Domain and create a SageMaker geospatial VPC endpoint in your VPC in the Amazon VPC console. Choose the service name as com.amazonaws.us-west-2.sagemaker-geospatial
and select the VPC in which to create the VPC endpoint.
All Amazon S3 resources that are used for input or output in EOJ and VEJ operations should have internet access enabled. If you have no direct access to those Amazon S3 resources via the internet, you can grant SageMaker geospatial VPC endpoint ID access to it by changing the corresponding S3 bucket policy. To learn more, see SageMaker geospatial capabilities Amazon VPC Support in the AWS documentation.
Example Use Case for Geospatial ML
Customers across various industries use Amazon SageMaker geospatial capabilities for real-world applications.
Maximize Harvest Yield and Food Security
Digital farming consists of applying digital solutions to help farmers optimize crop production in agriculture through the use of advanced analytics and machine learning. Digital farming applications require working with geospatial data, including satellite imagery of the areas where farmers have their fields located.
You can use SageMaker to identify farm field boundaries in satellite imagery through pre-trained models for land cover classification. Learn about How Xarvio accelerated pipelines of spatial data for digital farming with Amazon SageMaker Geospatial in the AWS Machine Learning Blog. You can find an end-to-end digital farming example notebook via the GitHub repository.
Damage Assessment
As the frequency and severity of natural disasters increase, it’s important that we equip decision-makers and first responders with fast and accurate damage assessment. You can use geospatial imagery to predict natural disaster damage and geospatial data in the immediate aftermath of a natural disaster to rapidly identify damage to buildings, roads, or other critical infrastructure.
From an example notebook, you can train, deploy, and predict natural disaster damage from the floods in Rochester, Australia, in mid-October 2022. We use images from before and after the disaster as input to its trained ML model. The results of the segmentation mask for the Rochester floods are shown in the following images. Here we can see that the model has identified locations within the flooded region as likely damaged.
You can train and deploy a geospatial segmentation model to assess wildfire damages using multi-temporal Sentinel-2 satellite data via GitHub repository. The area of interest for this example is located in Northern California, from a region that was affected by the Dixie Wildfire in 2021.
Monitor Climate Change
Earth’s climate change increases the risk of drought due to global warming. You can see how to acquire data, perform analysis, and visualize the changes with SageMaker geospatial capabilities to monitor shrinking shoreline caused by climate change in the Lake Mead example, the largest reservoir in the US.
You can find the notebook code for this example in the GitHub repository.
Predict Retail Demand
The new notebook example demonstrates how to use SageMaker geospatial capabilities to perform a vector-based map-matching operation and visualize the results. Map matching allows you to snap noisy GPS coordinates to road segments. With Amazon SageMaker geospatial capabilities, it is possible to perform a VEJ for map matching. This type of job takes a CSV file with route information (such as longitude, latitude, and timestamps of GPS measurements) as input and produces a GeoJSON file that contains the predicted route.
Support Sustainable Urban Development
Arup, one of our customers, uses digital technologies like machine learning to explore the impact of heat on urban areas and the factors that influence local temperatures to deliver better design and support sustainable outcomes. Urban Heat Islands and the associated risks and discomforts are one of the biggest challenges cities are facing today.
Using Amazon SageMaker geospatial capabilities, Arup identifies and measures urban heat factors with earth observation data, which significantly accelerated their ability to counsel clients. It enabled its engineering teams to carry out analytics that weren’t possible previously by providing access to increased volumes, types, and analysis of larger datasets. To learn more, see Facilitating Sustainable City Design Using Amazon SageMaker with Arup in AWS customer stories.
Now Available
Amazon SageMaker geospatial capabilities are now generally available in the US West (Oregon) Region. As part of the AWS Free Tier, you can get started with SageMaker geospatial capabilities for free. The Free Tier lasts 30 days and includes 10 free ml.geospatial.interactive compute hours, up to 10 GB of free storage, and no $150 monthly user fee.
After the 30-day free trial period is complete, or if you exceed the Free Tier limits defined above, you pay for the components outlined on the pricing page.
To learn more, see Amazon SageMaker geospatial capabilities and the Developer Guide. Give it a try and send feedback to AWS re:Post for Amazon SageMaker or through your usual AWS support contacts.
– Channy