AWS Certified Data Analytics - Specialty (DAS-C01): Question 150 Discussion

Question 150

A company has 10-15 TB of uncompressed .csv files in Amazon S3. The company is evaluating Amazon Athena as a
one-time query engine. The company wants to transform the data to optimize query runtime and storage costs.
Which option for data format and compression meets these requirements?

  • A. CSV compressed with zip
  • B. JSON compressed with bzip2
  • C. Apache Parquet compressed with Snappy
  • D. Apache Avro compressed with LZO
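Answer:

C

Explanation:
The referenced post recommends Apache Parquet or Apache ORC for Athena. These columnar formats let Athena read only the columns a query needs, compress well, and are splittable, which reduces both the data scanned per query and the storage footprint.
Reference: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/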
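
The linked post recommends columnar formats such as Apache Parquet for Athena. As a rough sketch of what that conversion could look like, the helper below builds an Athena CTAS (CREATE TABLE AS SELECT) statement that rewrites an existing CSV-backed table as Snappy-compressed Parquet. The function name, table names, and S3 location are hypothetical placeholders, and the `write_compression` property assumes a recent Athena engine version (older versions used `parquet_compression` instead).

```python
def build_ctas(source_table: str, target_table: str, s3_location: str) -> str:
    """Return an Athena CTAS statement that rewrites a table as Snappy-compressed Parquet.

    All names passed in are illustrative; nothing here is taken from the question itself.
    """
    return (
        f"CREATE TABLE {target_table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"              # columnar output format
        "  write_compression = 'SNAPPY',\n"    # assumes a recent Athena engine version
        f"  external_location = '{s3_location}'\n"
        ") AS\n"
        f"SELECT * FROM {source_table}"
    )


if __name__ == "__main__":
    # Hypothetical table names and bucket, for illustration only.
    print(build_ctas("raw_csv", "events_parquet", "s3://example-bucket/parquet/"))
```

The generated statement would be submitted through the Athena console or API. Selecting only the needed columns instead of `*` would reduce the output further.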

User Votes:
C: 1 vote
Discussions
kousik.cemk
12 months ago

As per the link: "For Athena, we recommend using either Apache Parquet or Apache ORC, which compress data by default and are splittable."