Intra-day reporting solution
Overview
The intra-day reporting system processes order data four times per hour (every 15 minutes), refreshing data on a rolling basis for orders in the preceding hour.
Each day, Amazon generates 96 individual files (4 per hour × 24 hours) containing order data, which includes both same-day orders and orders from previous days that experienced delayed processing. This continuous, around-the-clock publishing cycle gives merchants access to up-to-date order data.
This solution uses AWS services to secure the data and streamline its distribution. Specifically:
- Data encryption is handled securely using AWS Key Management Service (KMS).
- Merchants receive instant notifications about newly published files through Amazon Simple Notification Service (SNS) topics.
- For data consumption, merchants can choose between two methods:
  a. Directly accessing and copying data from the JWO S3 bucket using the AWS Command Line Interface (CLI).
  b. Implementing an event-driven approach that automatically copies data from the JWO S3 bucket, using Amazon Simple Queue Service (SQS) triggers in conjunction with AWS Lambda functions.
Solution overview
1- Amazon SNS sends the customer a notification event when a new file is available.
2- When the message arrives in the SQS queue, a Lambda function is triggered to process it. A sample message:
{
"Type" : "Notification",
"MessageId" : "34920377-24ac-5f27-8172-5dadc2dd506e",
"TopicArn" : "<Source SNS Topic Arn>",
"Subject" : "Amazon S3 Notification",
"Message" : "{\"Records\":[{\"eventVersion\":\"2.1\",\"eventSource\":\"aws:s3\",\"awsRegion\":\"us-east-1\",\"eventTime\":\"2024-06-05T22:18:24.822Z\",\"eventName\":
\"ObjectCreated:CompleteMultipartUpload\",\"userIdentity\":{\"principalId\":\"AWS:AROA5G7Q7N2GZSXO3I4Q6:RedshiftIamRoleSession\"},\"requestParameters\":
{\"sourceIPAddress\":\"54.81.145.214\"},\"responseElements\":{\"x-amz-request-id\":\"F7P4FE2Q4QXCW3JH\",\"x-amz-id-2\":\"aFW9gjd1vmA5sb1UrGJa1F6ATsRO2HLzjya/
UjdjtXnTdlP6exi83YMQPZ2kLXF3296kr3sUgNSpeERH1L/CcayWUIOeGMVT\"},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"<S3ReportNotifsConfigID>\",
\"bucket\":{\"name\":\"<S3BucketNameProvided>\",\"ownerIdentity\":{\"principalId\":\"AO07TJS0N6VFV\"},\"arn\":\"arn:aws:s3:::<S3BucketNameProvided>\"},\"object\":
{\"key\":\"Order/<ReportGenDate YYYY-MM-DD>/<ReportGenHour H24>/<ReportFileName>\",\"size\":166,\"eTag\":\"40dabe44630f69d6693151ea9d1aae44-1\",
\"versionId\":\"QywepIoqoFXvLzxs1vvAObS306zaU8e7\",\"sequencer\":\"006660E4307DDF229D\"}}}]}",
"Timestamp" : "2024-06-05T22:18:25.734Z",
"SignatureVersion" : "1",
"Signature" : "<SIGNATURE_VALUE>",
"SigningCertURL" : "<SIGNATURE_VALUE>",
"UnsubscribeURL" : "<UNSUSBSCRIBE_URL_VALUE>"
}
<Source SNS Topic Arn> = SNS topic that publishes the data. It maps 1:1 to an onboarded merchantID.
<S3ReportNotifsConfigID> = Event generation name/ID maintained on the Amazon JWO side. The merchant can ignore it.
<S3BucketNameProvided> = S3 bucket that hosts the report files. It maps 1:1 to an onboarded merchantID.
<ReportGenDate YYYY-MM-DD> = Date to which the report file corresponds, in 'YYYY-MM-DD' format.
<ReportGenHour H24> = Hour to which the report file corresponds, in 24-hour format.
<ReportFileName> = Name of the report file. Generally it is ALL<NN>.csv000, where <NN> represents the minute of the hour at which the file was generated.
3- The Lambda function accesses the source file in the Amazon S3 bucket.
4- The Lambda function copies the file from the Amazon S3 bucket and stores it in the customer's S3 bucket.
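Steps 2 to 4 depend on unwrapping the notification: the SQS message body is the SNS envelope shown above, and its "Message" field is itself a JSON document serialized as a string that holds the S3 event. The following is a minimal Python sketch of that unwrapping, not an Amazon-provided implementation. The function names (extract_s3_objects, parse_report_key) and the bucket name in the usage lines are hypothetical, and the sketch assumes the SNS envelope is present in the message body, as in the sample.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sns_envelope_json):
    """Return (bucket, key) pairs from one SNS envelope (the SQS message body)."""
    envelope = json.loads(sns_envelope_json)
    # "Message" is JSON serialized as a string, so it has to be decoded again.
    s3_event = json.loads(envelope["Message"])
    pairs = []
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def parse_report_key(key):
    """Split 'Order/<date>/<hour>/<file name>' into its three variable parts."""
    prefix, report_date, report_hour, file_name = key.split("/", 3)
    if prefix != "Order":
        raise ValueError(f"unexpected key prefix: {prefix!r}")
    return report_date, report_hour, file_name


# Usage with a made-up, heavily trimmed notification.
s3_event = {"Records": [{"s3": {
    "bucket": {"name": "example-jwo-bucket"},  # hypothetical bucket name
    "object": {"key": "Order/2024-06-05/22/ALL15.csv000"},
}}]}
envelope = json.dumps({"Type": "Notification", "Message": json.dumps(s3_event)})
for bucket, key in extract_s3_objects(envelope):
    print(bucket, parse_report_key(key))
```

The hour is kept as a string because the sample does not show whether single-digit hours are zero-padded.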
Onboarding Instructions
Please follow the instructions below to onboard with Amazon JWO intra-day reporting. Note: The sample policies and AWS resource configurations below are for illustration purposes only, and we recommend that you follow your organization's best practices for creating and securing AWS resources.
Step #1: Merchant resource creation
AWS Account Creation: Please create an AWS account if the merchant doesn't have one already. An account can be created at https://aws.amazon.com/resources/create-account/ (example AWS account ID: 012345678910).
Create IAM role: Please create an AWS IAM role and share it with the Amazon JWO team. This role will be used to set up access with the JWO AWS account.
Link to create an IAM Role: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create.html
IAM role format: {MERCHANT_NAME}_AMZN_JWOS_Role. The generated role ARN for the above IAM role would be as follows:
arn:aws:iam::012345678910:role/{MERCHANT_NAME}_AMZN_JWOS_Role
where {MERCHANT_NAME} is a placeholder to be replaced with the merchant's name. As an example, the merchant Coffee Company would be set up as
arn:aws:iam::012345678910:role/COFFEE_COMPANY_AMZN_JWOS_Role
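The naming rule above can be expressed as a small helper. This is only an illustrative sketch: the function name merchant_role_arn is hypothetical, and the normalization (upper-casing and replacing spaces with underscores) is an assumption inferred from the single Coffee Company example, so please confirm the exact convention with the Amazon JWO team.

```python
def merchant_role_arn(account_id, merchant_name):
    """Build the role ARN in the {MERCHANT_NAME}_AMZN_JWOS_Role format."""
    # Assumption: letters are upper-cased and whitespace becomes underscores,
    # mirroring the COFFEE_COMPANY example.
    token = "_".join(merchant_name.upper().split())
    # account_id is passed as a string so that leading zeros are preserved.
    return f"arn:aws:iam::{account_id}:role/{token}_AMZN_JWOS_Role"


print(merchant_role_arn("012345678910", "Coffee Company"))
# arn:aws:iam::012345678910:role/COFFEE_COMPANY_AMZN_JWOS_Role
```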
Step #2: Amazon merchant onboarding
In this step, the Amazon JWO team will onboard the merchant using the AWS account ID and the IAM role provided in Step #1 above. Once completed, the Amazon JWO team will provide the following information so that the merchant can complete the next steps on the merchant's end.
SNS topic ARN for the merchant to subscribe to. Make sure that you grant the provided SNS topic the SendMessage permission on the SQS queue.
KMS key inline policy, to be attached to the merchant's IAM role in the next step.
A sample KMS inline policy looks like the following:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowUseOfKeyInAmazon3POrderAccount",
"Effect": "Allow",
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:GenerateDataKey*"
],
"Resource": [
"merchant-specific-psbi-kms-arn_1",
"merchant-specific-psbi-kms-arn_2"
]
}
]
}
Details of the S3 bucket(s) where the files are published. The merchant's IAM role should also have permissions to get files/objects from these buckets. A sample IAM policy for the S3 permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:GetObject",
"s3:GetObjectVersion",
"s3:ListBucket",
"s3:GetObjectTagging"
],
"Resource": [
"arn:aws:s3:::<AMAZON_S3-BUCKET-NAME_1>",
"arn:aws:s3:::<AMAZON_S3-BUCKET-NAME_1>/*",
"arn:aws:s3:::<AMAZON_S3-BUCKET-NAME_2>",
"arn:aws:s3:::<AMAZON_S3-BUCKET-NAME_2>/*"
],
"Effect": "Allow",
"Sid": "AllowJwoRoleToAccessJWObucket"
}
]
}
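Each bucket appears twice in the sample policy because bucket-level actions such as s3:ListBucket apply to the bucket ARN, while object-level actions such as s3:GetObject apply to the object ARNs under it. A minimal sketch of generating the same policy document from a list of bucket names follows; the function name build_source_read_policy and the bucket name in the usage line are hypothetical.

```python
import json


def build_source_read_policy(bucket_names):
    """Build the read-only policy shown above for any number of source buckets."""
    resources = []
    for name in bucket_names:
        resources.append(f"arn:aws:s3:::{name}")    # bucket ARN, for s3:ListBucket
        resources.append(f"arn:aws:s3:::{name}/*")  # object ARNs, for the s3:GetObject* actions
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowJwoRoleToAccessJWObucket",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:ListBucket",
                "s3:GetObjectTagging",
            ],
            "Resource": resources,
        }],
    }


print(json.dumps(build_source_read_policy(["example-source-bucket"]), indent=2))
```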
Step #3: Merchant JWO setup
Using the information provided by the JWO team, the merchant attaches the KMS key policy and the S3 access policy above to the IAM role created in Step #1.
Create SQS queue: For steps on how to create an SQS queue, see the AWS documentation for Amazon SQS.
Set SQS policy: A sample SQS access policy statement, which must be present to allow our SNS topic to send messages to your queue:
{
"Sid": "AllowSNSTopictoSendSQSMessage",
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "SQS:SendMessage",
"Resource": "{YOUR_SQS_ARN}",
"Condition": {
"ArnLike": {
"aws:SourceArn": [
"SNS_TOPIC_PROVIDED_BY_AMAZON_1",
"SNS_TOPIC_PROVIDED_BY_AMAZON_2"
]
}
}
}
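The sample above is a single statement. A queue access policy is a full policy document, so the statement needs to sit inside a "Version"/"Statement" wrapper before it is applied. The sketch below builds such a document under that assumption; the function name build_queue_policy and the queue ARN in the usage lines are hypothetical, and the topic ARN placeholder must be replaced with the value Amazon provides.

```python
import json


def build_queue_policy(queue_arn, topic_arns):
    """Wrap the sample statement in a complete SQS access policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowSNSTopictoSendSQSMessage",
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "SQS:SendMessage",
            "Resource": queue_arn,
            # Only the SNS topics provided by Amazon may send to the queue.
            "Condition": {"ArnLike": {"aws:SourceArn": list(topic_arns)}},
        }],
    }


# The queue's "Policy" attribute expects the document as a JSON string.
policy_json = json.dumps(build_queue_policy(
    "arn:aws:sqs:us-east-1:012345678910:example-report-queue",  # hypothetical ARN
    ["SNS_TOPIC_PROVIDED_BY_AMAZON_1"],
))
print(policy_json)
```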
Create target S3 bucket: Create an S3 bucket to act as the destination for the files received from Amazon. For steps on how to create an S3 bucket, see the AWS documentation for Amazon S3.
Create IAM policy for the destination bucket: Below is an example IAM policy.
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:DeleteObject",
"s3:GetObject",
"s3:GetObjectVersion",
"s3:PutObject",
"s3:PutObjectAcl",
"s3:ListBucket",
"s3:GetObjectTagging"
],
"Resource": [
"arn:aws:s3:::<S3-BUCKET-NAME>",
"arn:aws:s3:::<S3-BUCKET-NAME>/*"
],
"Effect": "Allow",
"Sid": "AllowJWOroleToAccessTargetbucket"
}
]
}
Create Lambda function to process new report notifications: Create an AWS Lambda function to process the incoming SQS message and copy the Amazon report file. For steps on how to create a Lambda function, see the AWS documentation for AWS Lambda.
Set SQS to Lambda integration: To set up the SQS to Lambda function integration, follow the steps in the AWS documentation.
Work with Amazon to get and accept the SNS to SQS integration: To complete the end-to-end integration, you will need to accept a subscription request from the Amazon SNS topic to your SQS queue. You can follow step 2 in the AWS documentation.
Code snippets/resources: For a sample implementation and code samples, please reach out to your Amazon team contact.
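Until you receive the official samples, the shape of such a Lambda function can be sketched roughly as follows. This is an illustration under stated assumptions, not the Amazon reference implementation: the names copy_reports and lambda_handler and the DESTINATION_BUCKET environment variable are hypothetical, the SQS body is assumed to contain the SNS envelope shown earlier, and the copy uses the boto3 S3 client's copy_object call, which handles objects up to 5 GB in a single operation.

```python
import json
import os
from urllib.parse import unquote_plus


def copy_reports(event, s3_client, destination_bucket):
    """Copy every report referenced by an SQS-triggered event; return the copied keys."""
    copied = []
    for sqs_record in event.get("Records", []):
        envelope = json.loads(sqs_record["body"])   # SNS envelope
        s3_event = json.loads(envelope["Message"])  # S3 event, JSON inside a string
        # Messages without "Records" (for example S3 test events) are skipped.
        for s3_record in s3_event.get("Records", []):
            source_bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys in S3 event notifications are URL-encoded.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            s3_client.copy_object(
                Bucket=destination_bucket,
                Key=key,
                CopySource={"Bucket": source_bucket, "Key": key},
            )
            copied.append(key)
    return copied


def lambda_handler(event, context):
    import boto3  # bundled with the AWS Lambda Python runtimes

    destination = os.environ["DESTINATION_BUCKET"]  # hypothetical variable name
    return {"copied": copy_reports(event, boto3.client("s3"), destination)}
```

Passing the S3 client into copy_reports keeps the parsing and copy logic testable with a stub client, without AWS credentials.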
File processing considerations
De-duplication: You may occasionally receive the same file, or a notification for it, more than once. Your application should therefore detect files it has already processed and skip them, so that repeated deliveries do not create redundant data or distort analytics.
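One possible way to detect repeats is to record an identity for each file version and skip identities already seen. The sketch below is an assumption-laden illustration: it treats the object key plus versionId (falling back to eTag), both present in the notification's "object" section, as the identity, and it keeps the registry in memory, whereas a real deployment would need a durable store. The names file_identity and SeenFiles are hypothetical.

```python
def file_identity(s3_object):
    """Identity of one published file version, taken from the notification."""
    # Assumption: key + versionId (or eTag if there is no versionId)
    # uniquely identifies a file version.
    return (s3_object["key"], s3_object.get("versionId") or s3_object.get("eTag"))


class SeenFiles:
    """In-memory registry of processed files; replace with a durable store in practice."""

    def __init__(self):
        self._seen = set()

    def first_time(self, s3_object):
        """Return True the first time an identity is seen, False on repeats."""
        identity = file_identity(s3_object)
        if identity in self._seen:
            return False
        self._seen.add(identity)
        return True


registry = SeenFiles()
notification_object = {
    "key": "Order/2024-06-05/22/ALL15.csv000",
    "versionId": "QywepIoqoFXvLzxs1vvAObS306zaU8e7",
}
print(registry.first_time(notification_object))  # True: process the file
print(registry.first_time(notification_object))  # False: duplicate, skip it
```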
Data flow patterns: When processing order data that is generated every 15 minutes, it's essential to implement robust logic in your application to handle situations where a single order's information may be split across multiple files. Your processing system should incorporate an upsert (update/insert) mechanism that can intelligently manage these scenarios.
When receiving a new file, the application should first check if the order already exists in your database. If it does, the system should update the existing record with any new or modified information from the current file. If the order doesn't exist, it should create a new record. This approach ensures data consistency and prevents duplicate entries while maintaining the most current order information.
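The check-then-update-or-insert flow described above can be done in a single statement with a database upsert. The following sketch uses Python's built-in sqlite3 module purely for illustration; the table and column names (orders, order_id, status, total) are hypothetical because the report schema is not described here, and the ON CONFLICT clause assumes SQLite 3.24 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT, total REAL)"
)


def upsert_orders(conn, rows):
    """Insert new orders and update existing ones, keyed on order_id."""
    conn.executemany(
        """
        INSERT INTO orders (order_id, status, total)
        VALUES (:order_id, :status, :total)
        ON CONFLICT(order_id) DO UPDATE SET
            status = excluded.status,
            total = excluded.total
        """,
        rows,
    )
    conn.commit()


# First file: order A1 appears for the first time.
upsert_orders(conn, [{"order_id": "A1", "status": "PENDING", "total": 0.0}])
# Later file: A1 is updated and B2 is new.
upsert_orders(conn, [
    {"order_id": "A1", "status": "COMPLETE", "total": 12.5},
    {"order_id": "B2", "status": "PENDING", "total": 0.0},
])
print(conn.execute(
    "SELECT order_id, status, total FROM orders ORDER BY order_id"
).fetchall())
# [('A1', 'COMPLETE', 12.5), ('B2', 'PENDING', 0.0)]
```

Because the upsert is keyed on the order identifier, reprocessing the same file leaves the table unchanged, which complements the duplicate detection described earlier.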
Efficient file management: To optimize file management and resource utilization, the solution implements an efficient mechanism for handling new report files. Instead of relying on scheduled scans of the Amazon S3 bucket, which can be resource-intensive and potentially slow, the system leverages SNS (Simple Notification Service) messages. When a new report file is available, an SNS message is generated containing the file's metadata. The solution extracts the filename from this message and uses it to directly download the specific report file to the local environment.