Home IDS 410

Notes Index

3 Modeling Data in the Organization

Charles E. Oyibo


We discuss the entity-relationship (E-R) data model in this chapter.

Modeling the Rules of the Organization

Business rules are derived from policies, procedures, events, functions, and other business objects, and state constraints on the organization. Business rules are crucial to data modeling and providing names and definitions for entity types, attributes, and relationships.

The E-R model is used to construct a conceptual data model, which is a representation of the structure and constraints of a database that is independent of the software (such as the DBMS) and its associated data model that will be used to implement the database.

The E-R model was introduced in an article by Chen (1976) in which he described the main constructs of the E-R model: entities and relationships--and their associated attributes.

The job of a database designer is to:

  1. Identify and understand the (business) rules that govern data
  2. Represent those rules so that they can be unambiguously understood by information systems developers and users
  3. Implement those rules in database technology

A business rule is a statement that defines or constrains some aspect of the business. It is intended to assert business structure or to control or influence the behavior of the business... rules prevent, cause, or suggest things to happen.

A good business rule is...

Date Names and Definitions

Data names should:

A data definition (also called a structural assertion) is considered a type of business rule. A definition is an explanation of a term or a fact.

A term is a word or phrase that has a specific meaning for the business, and are often the key words used to form data names.

A fact is an association between two or more terms.

Good Data Definitions

Definitions (and all types of business rules) are gathered from the same sources as all requirements for information systems. Thus, systems and data analysts need to be looking for data objects and their definitions as these sources of information systems requirements are studied.

Definitions will usually supplement diagrams such as E-R diagrams, and should state such characteristics of a data object such as

The E-R Model

An entity-relationship (E-R) model is a detailed, logical representation of the data for an organization or for a business area.

The E-R model is expressed in terms of entities in the business environment, the relationships (or associations) among those entities, and the attributes (or properties) of both the entities and their relationships. The E-R model is usually expressed as an entity-relationship diagram.

It is critical to clearly and thoroughly define, as metadata, each entity in the E-R model; the accounting department might have a widely different definition of CUSTOMER than the marketing department.

An accurate and thorough E-R diagram without clear metadata may be interpreted in different ways by different people.

Note that each business rule (such as, "A SUPPLIER may supply many ITEMs) roughly follows a certain grammar:

<entity><minimum cardinality><relationship><maximum cardinality><entity>

as in

<CUSTOMER><may><submit><any number of><ORDERs>

E-R Relationship Model Constructs

Basic Symbols Relationship Degree Relationship Cardinality
  • Strong Entity
  • Weak Entity
  • Relationship
  • Identifying Relationship
  • Associative Entity
  • Attribute
  • Mutivalued Attribute
  • Derived Attribute
  • Unary
  • Binary
  • Ternary
  • Mandatory One
  • Mandatory Many
  • Optional One
  • Optional Many

Entities

An entity is a person, place, object, event, or concept in the user environment about which the organization wishes to maintain data.

Entity Type Versus Entity Instance

Entity Type: A collection of entities that share common properties or characteristics.

Entity Instance: A single occurrence of an entity type.

Entity Type Versus System Input, Output, or User

A simple rule to avoid confusion in distinguishing between data entities and other elements of an overall information systems model (such as input, output, or user) is that a true data entity will have many possible instances, each with a distinguishing characteristic, as well as one or more other descriptive pieces of data. If we consider the following entities and relationships then:

We note that TREASURER and EXPENSE REPORT are not entity types because:

Strong Versus Weak Entity

Strong Entity Type: An entity that exists independently of other entity types. Instances of a strong entity type always have a unique characteristic (called an identifier)--i.e., an attribute or combination of attributes that uniquely distinguish each occurrence of that entity.

Weak Entity Type: An entity types whose existence depends on some other entity type. A weak entity type does not have its own identifier (though on an E-R diagram, a weak entity type has an attribute that serves as a partial identifier, which is later combined with the identifier of its owner to form a full identifier).

Identifying Owner: The entity type on which the weak entity type depends.

Identifying Relationship: The relationship between a weak entity types and its owner.

Naming and Defining Entity Types

The following are guidelines for naming entity types:

The following are guidelines for defining entity types:

Attributes

Attribute: A property or characteristic of an entity types that is of interest to the organization. Represented in an E-R diagram with an ellipse with a line connecting it to its associated entity.

Naming Convention for Attributes: Initial capital letter followed by lower case letters; if two words, separate words with underscore and begin each word with capital letter.

Simple Versus Composite Attributes

Composite Attribute: An attribute that can be broken down into (meaningful) component parts.

Simple (or Atomic) Attribute: An attribute that cannot be broken down into smaller components.

Single-Valued Versus Multivalued Attributes

Single-Valued Attribute:

Multivalued Attribute: An attribute that may take on more than one value for a given entity instance. Represented with an ellipse with double lines in E-R diagrams.

Stored Versus Derived Attributes

Stored Attribute: An attribute whose value remains in the form in which it was entered into the database.

Derived Attribute: An attribute whose values can be calculated from related attribute values (plus possibly data not in the database). Represented in an E-R diagram using an ellipse with a dashed line.

Identifier Attribute

Identifier: An attribute (or combination of attributes) that uniquely identifies individual instances of an entity type. We underline identifier names in E-R diagrams.

Composite Identifier: An identifier that consists of a composite attribute. We use composite identifiers for some entity types because there is no single (atomic) attribute that can serve as the identifier (that is, will ensure uniqueness). The composite attributes are not underlined.

The following are criteria are suggested for selecting identifiers:

  1. choose an identifier that will not change its value over the life of each instance of the entity type
  2. choose an identifier such that for each instance of the entity, the attribute is guaranteed to have valid values and not be null (or unknown)
  3. avoid the use of so-called intelligent identifiers, whose structure indicates classifications, locations, etc.
  4. consider substituting single-attribute surrogate identifiers for large composite identifiers

Naming and Defining Attributes

Relationships

Intuitively, a relationship is an association among the instances of one or more entity types that is of interest to the organization. We use a diamond and a single verb phrase to represent a relationship.

Relationship Type: A meaningful association between (or among) entity types.

Relationship Instances: An association between (or among) entity instances where each relationship instance includes exactly one entity from each participating entity type.

Associative Entity: An entity type that associates the instances of one or more entity types and contains attributes that are peculiar to the relationship between those entity instance. (Associative entities are also referred to as Gerunds, as the relationship names are converted from a verb or verb phrase to a noun.)

The following four conditions should exist to convert a relationship to an associative entity type:

  1. all of the relationships for the participating entity types are "many" relationships
  2. the resulting associative entity type has independent meaning to end users, and preferably can be identified with a single-attribute identifier
  3. the associative entity has one or more attributes, in addition to the identifier
  4. the associative entity participates in one or more relationships independent of the entities related to the associated relationship

Converting a relationship to an associative entity type will generally cause the relationship cardinality to move.

Degree of a Relationship

The Degree of a Relationship: The number of entity types that participate in that relationship.

The following represent the three most common degrees of relationship. (Note that the models represent a specific situation, not a generalization):

Unary Relationships (Degree 1)

Unary Relationship: A relationships between instances of a single entity type. (Also referred to as Recursive Relationships)

Binary Relationships (Degree 2)

Binary Relationship: A relationship between the instances of two entity types.

Ternary Relationships (Degree 3)

Ternary Relationship: A simultaneous relationship among the instances of three entity types.

It is strongly recommended that all ternary (or higher) relationships be converted to associative entities. So, although not semantically accurate, we must represent the ternary relationship with an associative entity and three binary relationships (though a ternary relationship is not the same as three binary relationships).

Attribute or Relationship?

When should an attribute be linked to an entity type via a relationship? Answer: when the attribute is the identifier or some other characteristic of an entity type in the data model and multiple entity instances need to share these same attributes.

Cardinality Constraints

Cardinality Constraints: Specified the number of instance of one entity that can (or must) be associated with each instance of another entity.

Minimum Cardinality: The minimum number of instances of one entity that may be associated with each instance of another entity. Generally, if minimum cardinality is 0, cardinality is optional; if the minimum cardinality is 1, participation is mandatory.

Maximum Cardinality: The maximum number of instances of one entity that may be associated with each instance of another entity. It is possible for maximum cardinality to be a fixed number, not an arbitrary"many" value.

Modeling Time-Dependent Data

Time stamp: A time value (such as data and time) that is associated with a data value.

Time stamps may be recorded to indicate:

Multiple Relationships

We should not that in some situations organizations may wish to model more than one relationship between the same entity types. For example, we could have the follow relationships:

Naming and Defining Relationships

A relationship definition should:

E-R Modeling Example: Pine Valley Furniture Company. p. 112 Hoffer, et al.

Top of Page

Charles E. Oyibo
IDS :: CBA :: UIC