Part A
Two marks
1. Define Data warehouse (or) what is Data Warehouse?
A data warehouse is a repository of
multiple heterogeneous data sources organized under a unified schema at a
single site to facilitate management decision making. (Or)
A data warehouse is a subject-oriented,
time-variant and non-volatile collection of data in support of management’s
decision-making process.
2. What are
operational databases?
Organizations
maintain large database that are updated by daily transactions are called
operational databases.
3. Define OLTP?
If
an on-line operational database systems is used for efficient retrieval,
efficient storage and management of large amounts of data, then the system is
said to be on-line transaction processing.
4. Define OLAP?
Data
warehouse systems serves users (or) knowledge workers in the role of data analysis
and decision-making. Such systems can organize and present data in various
formats. These systems are known as on-line analytical processing systems.
5. Write short
notes on multidimensional data model?
Data warehouses and OLTP tools are based
on a multidimensional data model. This model is used for the design of
corporate data warehouses and department data marts. This model contains a Star
schema, Snowflake schema and Fact constellation schemas. The core of the
multidimensional model is the data cube.
6. Define data
cube?
It consists of a large set of facts (or)
measures and a number of dimensions.
7. What are facts?
Facts are numerical measures. Facts can
also be considered as quantities by which we can analyze the relationship
between dimensions.
8. What are
dimensions?
Dimensions
are the entities (or) perspectives with respect to an organization for keeping
records and are hierarchical in nature.
9. Define dimension
table?
A
dimension table is used for describing the dimension. (e.g.) A dimension table
for item may contain the attributes item_ name, brand and type.
10.
Define fact table?
Fact
table contains the name of facts (or) measures as well as keys to each of the
related dimensional tables.
11.
What are lattice of cuboids?
In
data warehousing research literature, a cube can also be called as cuboids. For
different (or) set of dimensions, we can construct a lattice of cuboids, each
showing the data at different level. The lattice of cuboids is also referred to
as data cube.
12.
What are apex cuboids?
The
0-D cuboids which holds the highest level of summarization is called the apex cuboids.
The apex cuboids are typically denoted by all.
13.
List out the components of star schema?
A
large central table (fact table) containing the bulk of data with no
redundancy. A set of smaller attendant tables (dimension tables), one for each
dimension.
14.
What is snowflake schema?
The
snowflake schema is a variant of the star schema model, where some dimension
tables are normalized thereby further splitting the tables in to additional
tables.
15.
List out the components of fact constellation
schema?
This
requires multiple fact tables to share dimension tables. This kind of schema
can be viewed as a collection of stars and hence it is known as galaxy schema
(or) fact constellation schema.
16.
Point out the major difference between the star
schema and the snowflake schema?
The
dimension table of the snowflake schema model may be kept in normalized form to
reduce redundancies. Such a table is easy to maintain and saves storage space.
17.
Which is popular in the data warehouse design, star
schema model (or) snowflake schema model?
Star
schema model, because the snowflake structure can reduce the effectiveness and
more joins will be needed to execute a query.
18.
Define concept hierarchy?
A
concept hierarchy defines a sequence of mappings from a set of low-level
concepts to higher-level concepts.
19.
Define total order?
If
the attributes of a dimension which forms a concept hierarchy such as
“street<city< province_or_state <country”, then it is said to be total
order. Country Province or state City Street Fig: Partial order for location
20.
Define partial order?
If
the attributes of a dimension which forms a lattice such as
“day<{month<quarter; week}<year, then it is said to be partial order.
21.
Define schema hierarchy?
A
concept hierarchy that is a total (or) partial order among attributes in a
database schema is called a schema hierarchy.
22.
List out the OLAP operations in multidimensional
data model?
Roll-up
_ Drill-down _ Slice and dice _ Pivot (or) rotate
23.
What is roll-up operation?
The
roll-up operation is also called drill-up operation which performs aggregation
on a data cube either by climbing up a concept hierarchy for a dimension (or)
by dimension reduction.
24.
What is drill-down operation?
Drill-down
is the reverse of roll-up operation. It navigates from less detailed data to
more detailed data. Drill-down operation can be taken place by stepping down a
concept hierarchy for a dimension.
25.
What is slice operation?
The
slice operation performs a selection on one dimension of the cube resulting in
a sub cube.
26.
What is dice operation?
The
dice operation defines a sub cube by performing a selection on two (or) more
dimensions.
27.
What is pivot operation?
This
is a visualization operation that rotates the data axes in an alternative
presentation of the data.
28.
List out the views in the design of a data
warehouse?
Top-down
view _ Data source view _ Data warehouse view _ Business query view.
29.
What are the methods for developing large software
systems?
Waterfall
method _ Spiral method
30.
How the operation is performed in waterfall method?
The
waterfall method performs a structured and systematic analysis at each step
before proceeding to the next, which is like a waterfall falling from one step
to the next.
31.
How the operation is performed in spiral method?
The
spiral method involves the rapid generation of increasingly functional systems,
with short intervals between successive releases. This is considered as a good
choice for the data warehouse development especially for data marts, because
the turnaround time is short, modifications can be done quickly and new designs
and technologies can be adapted in a timely manner.
32.
List out the steps of the data warehouse design
process?
Choose
a business process to model.
Choose the grain of the business process
Choose the dimensions that will apply to each
fact table record.
Choose the measures that will populate each
fact table record.
33.
What is enterprise warehouse?
An
enterprise warehouse collects all the information’s about subjects spanning the
entire organization. It provides corporate-wide data integration, usually from
one (or) more operational systems (or) external information providers. It
contains detailed data as well as summarized data and can range in size from a
few giga bytes to hundreds of giga bytes, tera bytes (or) beyond.
34.
What is data mart?
Data
mart is a database that contains a subset of data present in a data warehouse.
Data marts are created to structure the data in a data warehouse according to
issues such as hardware platforms and access control strategies. We can divide
a data warehouse into data marts after the data warehouse has been created.
Data marts are usually implemented on low-cost departmental servers that are
UNIX (or) windows/NT based.
35.
What are dependent and independent data marts?
Dependent
data marts are sourced directly from enterprise data warehouses. Independent
data marts are data captured from one (or) more operational systems (or)
external information providers (or) data generated locally with in particular
department (or) geographic area.
36.
What is virtual warehouse?
A virtual warehouse is a set of views
over operational databases. For efficient query processing, only some of the
possible summary views may be materialized. A virtual warehouse is easy to
build but requires excess capability on operational database servers.
37.
Define indexing?
Indexing is a technique, which is used
for efficient data retrieval (or) accessing data in a faster manner. When a
table grows in volume, the indexes also increase in size requiring more
storage.
38.
What are the types of indexing?
B-Tree
indexing _ Bit map indexing _ Join indexing
39.
Define metadata?
Metadata
is used in data warehouse is used for describing data about data. (i.e.) Meta
data are the data that define warehouse objects. Metadata are created for the
data names and definitions of the given warehouse.
40.
Define VLDB?
Very
Large Data Base. If a database whose size is greater than 100GB, then the
database is said to be very large database.
PART
B
16
(OR) 8 MARKS
1. Discuss the components of data warehouse. (8)
_Subject-oriented
_Integrated
_Time-Variant
_Non-volatile
2. List out the differences between OLTP and OLAP. (8)
_ Users and system orientation
_ Data contents
_ Database design
_ View
_ Access patterns
3. Discuss the various
schematic representations in multidimensional model.
_ Star schema
_ Snow flake schema
_ Fact constellation schema
4. Explain the OLAP operations
I multidimensional model.
_ Roll-up
_ Drill-down
_ Slice and dice
_ Pivot or rotate
5. Explain the design and construction of a data warehouse.
_ Design of a data warehouse
• Top-down
view
• Data
source view
• Data
warehouse view
• Business
query view
_ Process of data warehouse design
6. Explain the three-tier data warehouse architecture.
_ Warehouse database server (Bottom tier)
_ OLAP server (middle tier)
_ Client (top tier)
7. Explain indexing.
_ Definition
_ B-Tree indexing
_ Bit-map indexing
_ Join indexing
8. Write notes on metadata
repository.
_ Definition
_ Structure of the data warehouse
_ Operational metadata
_ Algorithms used for summarization
_ Mapping from operational environment to data warehouse
_ Data related to system performance
_ Business metadata
No comments:
Post a Comment