# dev-metaflow
I just want to make sure: the existing data from before the streaming, which we use for samples, is typically in DynamoDB (in big volume). The new data, the events, is in DynamoDB (in big volume) or in RDS for small volume. I read that the data is streamed and batch processed into an Apache Hadoop datastore, through an Apache Hive layer. What's better? Thanks in advance
storing streaming data in DynamoDB/RDS and then exporting it to (Parquet) snapshots with metadata in Hive for batch processing is a tried-and-true pattern
what do you have in mind for “what’s better”?
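to make the snapshot step concrete, here's a minimal stdlib sketch of the export side: group events into Hive-style date partitions (`dt=YYYY-MM-DD/`) and write each partition as a snapshot file. assumptions: events carry a unix `ts` field, and JSON lines stands in for Parquet; a real pipeline would scan DynamoDB with boto3 and write Parquet with pyarrow instead

```python
import json
import os
from collections import defaultdict
from datetime import datetime, timezone


def snapshot_events(events, out_dir):
    """Write events into Hive-style date-partitioned snapshot files.

    Sketch only: a real export would scan DynamoDB (boto3) and emit
    Parquet (pyarrow); here JSON lines keeps the example stdlib-only.
    Each event is assumed to have a unix-seconds "ts" field.
    """
    # Bucket events by their UTC calendar day.
    partitions = defaultdict(list)
    for ev in events:
        ts = datetime.fromtimestamp(ev["ts"], tz=timezone.utc)
        partitions[ts.strftime("%Y-%m-%d")].append(ev)

    written = []
    for day, rows in sorted(partitions.items()):
        # Hive discovers partitions from the dt=... directory layout.
        part_dir = os.path.join(out_dir, f"dt={day}")
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-00000.json")
        with open(path, "w") as f:
            for row in rows:
                f.write(json.dumps(row) + "\n")
        written.append(path)
    return written
```

with that layout, a Hive external table pointed at `out_dir` and partitioned by `dt` can batch-query the snapshots without touching the serving databases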
THANKS