In this course, you should have learned much about big data management. This assignment tests
your acquired knowledge, critical thinking, and capacity for self-directed discovery in big data management.
Some questions ask for a summary with critical reflection; others are exploratory and require
self-discovery. Marking criteria include depth of acquired knowledge (30%), depth of reflective
analysis (30%), discovery ability in knowledge exploration (30%), and organization of writing (10%). You
must answer all questions.

1) Developing Big Data Applications [20 marks]

a. Summarize and reflect on what you have learned from this course about conceptual modeling in
big data management. Illustrate, with diagrams and explanation, the use of conceptual
modeling in your group project (do not share with your groupmates); (at least 600 words)

b. Search external sources for more detailed information about ECL, which is used in the HPCC
environment. Describe ECL’s declarative programming constructs, especially how its AI features
can solve BDA problems of a task-parallel nature; (at least 600 words)

2) Data & Task Parallelism [25 marks]

a. Summarize and reflect on what you have learned from this course about data parallelism and task
parallelism; (at least 300 words)

b. Design a full example illustrating your understanding of data parallelism. The case scenario must
require a “composite-key; multiple-value” key-value pair, e.g. {city + day, max temp + min temp};
use 30 sets of raw data split across 2 data nodes; and present a 5-stage MapReduce solution. The
solution format should be similar to class exercises 3 or 4, but the scenario and raw data must be
totally different;
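
For illustration only, the “composite-key; multiple-value” shape named in 2b can be sketched in Python as below. This is a minimal sketch under stated assumptions, not a model answer: the stage names (split, map, shuffle, reduce, with the input list and final result as the remaining stages) are assumed and may differ from the five stages used in class, the function names are hypothetical, and the six readings are made-up sample values rather than the 30 sets the question requires.

```python
from collections import defaultdict

# Hypothetical raw readings: (city, day, temperature). Not real data.
RAW = [
    ("Oslo", "Mon", 3), ("Oslo", "Mon", 7), ("Lima", "Mon", 19),
    ("Lima", "Mon", 24), ("Oslo", "Tue", 1), ("Oslo", "Tue", 5),
]

def split(records, nodes=2):
    """Split: deal the records round-robin across the data nodes."""
    return [records[i::nodes] for i in range(nodes)]

def map_phase(chunk):
    """Map: emit ((city, day), (temp, temp)) for every reading.

    The key is composite (city + day); the value carries two fields,
    a candidate maximum and a candidate minimum.
    """
    return [((city, day), (temp, temp)) for city, day, temp in chunk]

def shuffle(mapped_chunks):
    """Shuffle: gather the values from all nodes under each composite key."""
    groups = defaultdict(list)
    for chunk in mapped_chunks:
        for key, value in chunk:
            groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each group to a single (max temp, min temp) value."""
    return {
        key: (max(hi for hi, _ in values), min(lo for _, lo in values))
        for key, values in groups.items()
    }

def run(records):
    chunks = split(records)                    # one chunk per data node
    mapped = [map_phase(c) for c in chunks]    # map runs per node
    return reduce_phase(shuffle(mapped))

result = run(RAW)
for key, value in sorted(result.items()):
    print(key, value)
```

Because each node maps its own chunk independently and only the shuffle step brings values together, the sketch shows data parallelism: the same map function is applied to different partitions of the data.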

3) Tools and techniques for big data [20 marks]

a. Using a table, compare and contrast MPP and SMP in terms of their strengths,
weaknesses, and application areas; (at least 500 words)

b. Find another big data platform (other than HPCC and Spark) similar to Hadoop. Using a
table, compare this platform with Hadoop in terms of features, strengths, and weaknesses. (at least 500
words)

4) Data governance for big data management [15 marks]

a. Summarize and reflect on what you have learned from this course about the five key concepts for
big data oversight; (at least 600 words)

b. Illustrate your understanding of the dimensions for measuring the quality of information used for
BDA, with one example for each dimension; (at least 600 words)

5) NoSQL Database [20 marks]

a. Using a table, compare and contrast (in terms of structure, advantages, disadvantages, and
application areas) the four types of NoSQL databases that you have learned about in this course; (at
least 800 words)

b. Find one workable column-oriented NoSQL database product on the IT vendor market, and
describe it, including its features and price. Critically analyze why and how this NoSQL database can
be applied in the case “What does Big Data have to do with an owl?” (at least 600 words)

Type of service: Academic paper writing
Type of assignment: Team paper
Subject: IT & Technology
Pages/words: 15/4100
Number of Sources: 0
Academic level: Junior (college 3rd year)
Paper Format: Harvard
Line spacing: Double
Language style: US English

