I built an architecture that automates the thawing of genomic files stored in AWS Glacier when free-tier users upgrade to premium. This architecture leverages several AWS services, including SNS, SQS, Lambda, DynamoDB, and S3, ensuring efficient and reliable file retrieval and storage.
Key Components and Workflow:
1. Premium User Upgrade Event:
When a user links their Stripe account to the Genomics Analysis Service (GAS) and upgrades to premium, a message describing the upgrade is published to the SNS topic navyavedachala_a16_start_thaw.
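A minimal sketch of publishing the upgrade event, assuming a boto3 SNS client is passed in. The function name and the message schema (a JSON object with a user_id field) are illustrative assumptions, not necessarily what GAS sends.

```python
import json


def publish_upgrade_event(sns_client, topic_arn, user_id):
    """Publish a 'user upgraded' message to the start-thaw SNS topic.

    The {"user_id": ...} payload is an assumed schema for illustration.
    """
    message = json.dumps({"user_id": user_id})
    # publish() returns a dict that includes the assigned MessageId.
    return sns_client.publish(TopicArn=topic_arn, Message=message)
```

Passing the client in as an argument keeps the function easy to exercise with a stub instead of a live AWS connection.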
2. Thaw Endpoint Subscription:
The /thaw API endpoint is subscribed to the navyavedachala_a16_start_thaw SNS topic. When a user upgrades, it receives the notification and begins processing.
3. Polling from SQS Queue:
An SQS queue navyavedachala_a16_start_thaw, subscribed to the SNS topic, buffers the upgrade messages until they are polled and processed. Because SQS delivers each message at least once and retains it until it is explicitly deleted, an upgrade event is not lost if the consumer is down or fails partway through processing.
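The polling step can be sketched as follows, assuming a boto3 SQS client and the default SNS-to-SQS delivery format, in which the SQS body is a JSON envelope whose Message field holds the published text. The names poll_once and handler are hypothetical. The message is deleted only after the handler returns, so a failure leaves it on the queue for redelivery.

```python
import json


def extract_sns_payload(sqs_body):
    """Unwrap the SNS envelope stored in an SQS message body."""
    envelope = json.loads(sqs_body)
    return json.loads(envelope["Message"])


def poll_once(sqs_client, queue_url, handler):
    """Long-poll the queue once and process whatever arrives.

    Returns the number of messages handled. If handler raises, the
    message is not deleted and becomes visible again later.
    """
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling reduces empty responses
    )
    handled = 0
    for message in response.get("Messages", []):
        handler(extract_sns_payload(message["Body"]))
        sqs_client.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        handled += 1
    return handled
```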
4. Querying DynamoDB for Archived Files:
Within the /thaw endpoint, the DynamoDB table is queried for jobs that have an associated results_file_archive_id, which marks results files archived for free-tier users. I used a Query operation rather than a full table Scan, which keeps the read cost low.
5. File Information Extraction:
The query also retrieves s3_key_result_file and job_id for the corresponding jobs. This metadata is passed into the Glacier retrieval process to track the files and ensure accurate restoration.
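One way steps 4 and 5 could look, assuming a low-level boto3 DynamoDB client, string-typed attributes, and a hypothetical secondary index named user_id_index keyed on user_id (the source does not say which key or index is used). Note that the filter expression is applied after items are read, so the saving comes from restricting the Query to one user's partition.

```python
def find_archived_jobs(dynamodb_client, table_name, user_id,
                       index_name="user_id_index"):
    """Return the user's jobs that have a Glacier archive ID.

    index_name is an assumed secondary index with user_id as its
    partition key. All projected attributes are assumed to be strings.
    """
    kwargs = {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "user_id = :uid",
        "FilterExpression": "attribute_exists(results_file_archive_id)",
        "ProjectionExpression":
            "job_id, s3_key_result_file, results_file_archive_id",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }
    jobs = []
    while True:
        page = dynamodb_client.query(**kwargs)
        for item in page.get("Items", []):
            # Convert {"job_id": {"S": "j-1"}} into {"job_id": "j-1"}.
            jobs.append({name: value["S"] for name, value in item.items()})
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return jobs
        # Continue from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```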
6. Initiating Glacier Job:
For each archived file, I initiate a Glacier retrieval job using the glacier_client.initiate_job function.
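A hedged sketch of the retrieval request, assuming a boto3 Glacier client. Carrying job_id and s3_key_result_file as JSON in the job description is one possible way to pass the metadata along, not necessarily the approach used here; the vault name, tier, and topic ARN are caller-supplied placeholders.

```python
import json


def start_thaw(glacier_client, vault_name, archive_id, job_id, s3_key,
               sns_topic_arn, tier="Standard"):
    """Start a Glacier archive-retrieval job and return its job ID.

    The job metadata travels in the Description field (an assumption
    for this sketch) so it comes back in the completion notification.
    """
    description = json.dumps(
        {"job_id": job_id, "s3_key_result_file": s3_key}
    )
    response = glacier_client.initiate_job(
        vaultName=vault_name,
        jobParameters={
            "Type": "archive-retrieval",
            "ArchiveId": archive_id,
            "Description": description,
            "SNSTopic": sns_topic_arn,  # Glacier notifies this topic when done
            "Tier": tier,               # "Expedited", "Standard", or "Bulk"
        },
    )
    return response["jobId"]
```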
7. Receiving Glacier Completion Notification:
An SNS topic navyavedachala_a16_complete_thaw is configured to receive messages from Glacier when a file has been successfully retrieved. The SQS queue navyavedachala_a16_complete_thaw is subscribed to this topic and captures these completion messages.
8. Lambda Function for File Restoration:
The Lambda function navyavedachala_a16_restore is triggered by messages arriving in the navyavedachala_a16_complete_thaw SQS queue.
9. Cleanup and Archive Deletion:
Glacier's completion message includes the glacier_job_id of the finished retrieval job. The Lambda function uses this glacier_job_id to fetch the retrieved file and save it to S3. It then deletes the archive from Glacier and removes the message from the SQS queue, completing the thawing process.
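The restore step might be sketched like this for a single SQS record delivered to Lambda. It assumes boto3 Glacier and S3 clients, the default SNS envelope in the record body, and a job description holding a JSON object with job_id and s3_key_result_file; the function name, vault, and bucket parameters are hypothetical. SQS message removal is omitted: when Lambda is invoked through an SQS trigger, it deletes the messages itself once the handler returns without error.

```python
import json


def restore_record(record, glacier_client, s3_client, vault_name, bucket):
    """Copy one thawed archive to S3, then delete it from Glacier.

    record is a single entry from event["Records"] in an SQS-triggered
    Lambda. Returns the job_id found in the (assumed) job description.
    """
    envelope = json.loads(record["body"])     # SNS envelope
    notice = json.loads(envelope["Message"])  # Glacier job notification
    if notice.get("StatusCode") != "Succeeded":
        raise RuntimeError("Glacier job did not succeed: %s"
                           % notice.get("StatusCode"))

    meta = json.loads(notice["JobDescription"])
    output = glacier_client.get_job_output(
        vaultName=vault_name, jobId=notice["JobId"]
    )
    s3_client.put_object(
        Bucket=bucket,
        Key=meta["s3_key_result_file"],
        Body=output["body"].read(),
    )
    # Delete the archive only after the S3 write has succeeded.
    glacier_client.delete_archive(
        vaultName=vault_name, archiveId=notice["ArchiveId"]
    )
    return meta["job_id"]
```

Ordering the delete after the S3 write means a failed upload raises before the archive is removed, so the message can be retried without data loss.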
10. Frontend Integration:
The frontend reflects the file restoration status.