Data modeling
From WhyNotWiki
What it is
http://en.wikipedia.org/wiki/Data_model
A data model is an abstract model that describes how data is represented and used. The term data model has two generally accepted meanings:
- A data model theory, i.e. a formal description of how data may be structured and used. See also database model.
- A data model instance, i.e. the result of applying a data model theory to create a practical data model for some particular application. See data modeling.
Data Model Theory
A data model theory has three main components:
- The structural part: a collection of data structures which are used to create databases representing the entities or objects modeled by the database.
- The integrity part: a collection of rules governing the constraints placed on these data structures to ensure structural integrity.
- The manipulation part: a collection of operators which can be applied to the data structures, to update and query the data contained in the database.
For example, in the relational model, the structural part is based on a modified concept of the mathematical relation; the integrity part is expressed in first-order logic; and the manipulation part is expressed using the relational algebra, tuple calculus, and domain calculus.
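The three parts can be illustrated with a toy relation in plain Ruby. This is only a sketch under stated assumptions: the `Relation` class, its `insert`, `restrict`, and `project` methods, and the sample employee data are hypothetical names invented for illustration, and the "integrity part" here is reduced to a heading check and a key-uniqueness check.

```ruby
require 'set'

# Structural part: a relation is a heading (attribute names) plus a set of tuples.
class Relation
  attr_reader :heading, :tuples

  # key: the attribute names whose values must be unique across tuples.
  def initialize(heading, key:, tuples: Set.new)
    @heading = heading
    @key = key
    @tuples = tuples
  end

  # Integrity part: reject tuples with the wrong attributes or a duplicate key.
  def insert(tuple)
    raise ArgumentError, "attributes must be #{heading}" unless tuple.keys.sort == heading.sort
    if tuples.any? { |t| t.values_at(*@key) == tuple.values_at(*@key) }
      raise ArgumentError, "duplicate key #{tuple.values_at(*@key)}"
    end
    tuples << tuple
    self
  end

  # Manipulation part: restriction keeps the tuples matching a predicate.
  def restrict(&predicate)
    Relation.new(heading, key: @key, tuples: tuples.select(&predicate).to_set)
  end

  # Manipulation part: projection keeps only the named attributes.
  def project(*attrs)
    Relation.new(attrs, key: attrs, tuples: tuples.map { |t| t.slice(*attrs) }.to_set)
  end
end

employees = Relation.new(%i[id name dept], key: %i[id])
employees.insert(id: 1, name: 'Ada', dept: 'R&D')
employees.insert(id: 2, name: 'Bob', dept: 'Sales')
sales_names = employees.restrict { |t| t[:dept] == 'Sales' }.project(:name)
```

Because the tuples are held in a set, a projection that produces identical tuples collapses them, which mirrors the set semantics of the mathematical relation.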
Data Model Instance
Data modeling is the process of creating a data model instance by applying a data model theory. This is typically done to solve some business enterprise requirement.
Business requirements are normally captured by a semantic logical data model. This is transformed into a physical data model instance, from which a physical database is generated. For more information on the tools and techniques of data modeling, see data modeling.
http://en.wikipedia.org/wiki/Logical_data_model
In computer science, a logical data model is a representation of an organization's data, organized in terms of a particular data management technology. [...] Now, the choices are relational, object-oriented, and XML. Relational data are described in terms of tables and columns. Object-oriented data are described in terms of classes, attributes, and associations. XML is described in terms of tags.
Logical data models, properly designed, should be based on the structures identified in the conceptual data model. This, after all, describes the semantics of the business, which the logical model should also reflect. Even so, since the logical data model anticipates implementation on a finite-capacity computer, some will modify the structure to achieve certain efficiencies.
http://en.wikipedia.org/wiki/Data_modeling
In computer science, data modeling is the process of creating a data model by applying a data model theory to create a data model instance. A data model theory is a formal data model description. See database model for a list of current data model theories. When data modeling, we are structuring and organizing data. These data structures are then typically implemented in a database management system. In addition to defining and organizing the data, data modeling will impose (implicitly or explicitly) constraints or limitations on the data placed within the structure.
Managing large quantities of structured and unstructured data is a primary function of information systems. Data models describe structured data for storage in data management systems such as relational databases. They typically do not describe unstructured data, such as word processing documents, email messages, pictures, digital audio, and video.
A data model instance may be one of three kinds (according to ANSI in 1975[1]):
- a conceptual schema (data model) describes the semantics of an organization. This consists of entity classes (representing things of significance to the organization) and relationships (assertions about associations between pairs of entity classes).
- a logical schema (data model) describes the semantics, as represented by a particular data manipulation technology. This consists of descriptions of tables and columns, object-oriented classes, and XML tags, among other things.
- a physical schema (data model) describes the physical means by which data are stored. This is concerned with partitions, CPUs, tablespaces, and the like.
The significance of this approach, according to ANSI, is that it allows the three perspectives to be relatively independent of each other. Storage technology can change without affecting either the logical or the conceptual model. The table/column structure can change without (necessarily) affecting the conceptual model.
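A rough way to picture that independence is to keep the three schemas as separate data structures. The sketch below is an assumption-laden illustration: the Customer/Order example, the constant names, the `create_table_sql` helper, and the SQL-like output (not tied to any particular dialect) are all hypothetical.

```ruby
# Conceptual schema: entity classes and relationships, no technology implied.
CONCEPTUAL = {
  entities: %w[Customer Order],
  relationships: [%w[Customer places Order]]
}.freeze

# Logical schema: the same semantics expressed as relational tables and columns.
LOGICAL = {
  'customers' => { 'id' => 'integer', 'name' => 'string' },
  'orders'    => { 'id' => 'integer', 'customer_id' => 'integer', 'total' => 'decimal' }
}.freeze

# Physical schema: where and how each table is stored.
PHYSICAL = {
  'customers' => { tablespace: 'ts_main' },
  'orders'    => { tablespace: 'ts_main' }
}.freeze

# Combine the logical and physical layers into an SQL-like statement.
def create_table_sql(table, logical: LOGICAL, physical: PHYSICAL)
  columns = logical.fetch(table).map { |name, type| "#{name} #{type}" }.join(', ')
  "CREATE TABLE #{table} (#{columns}) TABLESPACE #{physical.fetch(table)[:tablespace]}"
end

# Moving a table to different storage touches only the physical layer.
faster = PHYSICAL.merge('customers' => { tablespace: 'ts_fast' })
puts create_table_sql('customers', physical: faster)
```

Here a storage change replaces one entry in the physical hash, while `LOGICAL` and `CONCEPTUAL` stay exactly as they were.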
General about modeling (how to, best practices, ...)
http://en.wikipedia.org/wiki/Data_modeling
The entities represented by a data model can be tangible entities, but models that include such concrete entity classes tend to change over time. Robust data models often identify abstractions of such entities. For example, a data model might include an entity class called "Person", representing all the people who interact with an organization. Such an abstract entity class is typically more appropriate than ones called "Vendor" or "Employee", which identify specific roles played by those people.
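As a minimal sketch of that idea, the roles can be modeled as data attached to a single `Person` entity class rather than as separate "Vendor" and "Employee" classes. The `add_role` and `role?` method names and the role symbols are assumptions made for this illustration.

```ruby
# One abstract entity class for everyone who interacts with the organization.
class Person
  attr_reader :name, :roles

  def initialize(name)
    @name = name
    @roles = []
  end

  # A role is a fact about the person, not a separate kind of entity.
  def add_role(role)
    @roles << role unless @roles.include?(role)
    self
  end

  def role?(role)
    @roles.include?(role)
  end
end

pat = Person.new('Pat').add_role(:employee)
pat.add_role(:vendor) # the same person later also sells to the organization
```

When a person takes on a new role, only the list of roles grows; no record has to move between entity classes.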
http://www.utexas.edu/its/windows/database/datamodeling/dm/design.html
The data model is one part of the conceptual design process. The other is the function model. The data model focuses on what data should be stored in the database, while the function model deals with how the data is processed. To put this in the context of the relational database, the data model is used to design the relational tables. The function model is used to design the queries that will access and perform operations on those tables.
Data modeling is preceded by planning and analysis. The effort devoted to this stage is proportional to the scope of the database. The planning and analysis of a database intended to serve the needs of an enterprise will require more effort than one intended to serve a small workgroup.
The information needed to build a data model is gathered during the requirements analysis. Although not formally considered part of the data modeling stage by some methodologies, in reality the requirements analysis and the ER diagramming part of the data model are done at the same time.
...
The requirements analysis is usually done at the same time as the data modeling. As information is collected, data objects are identified and classified as entities, attributes, or relationships; assigned names; and defined using terms familiar to the end users. The objects are then modeled and analyzed using an ER diagram. [...]
Three points to keep in mind during the requirements analysis are:
- Talk to the end users about their data in "real-world" terms. Users do not think in terms of entities, attributes, and relationships but about the actual people, things, and activities they deal with daily.
- Take the time to learn the basics about the organization and its activities that you want to model. Having an understanding about the processes will make it easier to build the model.
- End-users typically think about and view data in different ways according to their function within an organization. Therefore, it is important to interview the largest number of people that time permits.
...
While the ER model lists and defines the constructs required to build a data model, there is no standard process for doing so. Some methodologies, such as IDEF1X, specify a bottom-up development process where the model is built in stages. Typically, the entities and relationships are modeled first, followed by key attributes, and then the model is finished by adding non-key attributes. Other experts argue that, in practice, a phased approach is impractical because it requires too many meetings with the end users. The sequence used in this document is:
- Identification of data objects and relationships
- Drafting the initial ER diagram with entities and relationships
- Refining the ER diagram
- Adding key attributes to the diagram
- Adding non-key attributes
- Diagramming Generalization Hierarchies
- Validating the model through normalization
- Adding business and integrity rules to the Model
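The early steps of this sequence can be sketched as a data structure that is filled in stage by stage. Everything named here is hypothetical: the Customer/Order example, the hash layout, and the helpers `sample_er_model`, `entities_without_key`, and `dangling_relationships`. The two checks at the end are simple sanity checks and are much weaker than validating through normalization.

```ruby
def sample_er_model
  # Identify data objects and relationships (the initial ER diagram).
  model = {
    entities: {
      'Customer' => { key: [], attributes: [] },
      'Order'    => { key: [], attributes: [] }
    },
    relationships: [
      { name: 'places', from: 'Customer', to: 'Order', cardinality: '1:N' }
    ]
  }

  # Add key attributes.
  model[:entities]['Customer'][:key] << 'customer_id'
  model[:entities]['Order'][:key] << 'order_id'

  # Add non-key attributes.
  model[:entities]['Customer'][:attributes].push('name', 'email')
  model[:entities]['Order'][:attributes].push('placed_on', 'total')

  model
end

# Names of entities that still have no key attribute.
def entities_without_key(model)
  model[:entities].select { |_name, entity| entity[:key].empty? }.keys
end

# Relationships that refer to an entity missing from the model.
def dangling_relationships(model)
  model[:relationships].reject do |rel|
    model[:entities].key?(rel[:from]) && model[:entities].key?(rel[:to])
  end
end
```

Running both helpers after each refinement pass gives a quick indication of which entities or relationships still need attention before the model is shown to end users again.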
Modeling [conversation threads (category)]
[ActiveRecord (category)]:
ryanb at http://railsforum.com/viewtopic.php?pid=22803#p22803
```ruby
class Message < ActiveRecord::Base
  # Both associations point at the User model through different foreign keys.
  belongs_to :author, :class_name => 'User', :foreign_key => 'author_id'
  belongs_to :recipient, :class_name => 'User', :foreign_key => 'recipient_id'
  # acts_as_tree (a plugin) links replies to their original via parent_id.
  acts_as_tree
end
```
When creating a message you just have a form setting the content and recipient_id. The author_id can automatically be set to the current user.
When replying to a message you can set both the recipient_id and author_id columns automatically. You would also need to set the parent_id to the id of the original message. Sounds like it will work pretty well. If you need the "make public" option you could do this with a simple boolean column.
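The reply logic can be sketched without Rails or a database. The following plain-Ruby stand-in is an assumption for illustration only: it uses a `Struct` instead of ActiveRecord, and the `reply` method and its `current_user_id` parameter are hypothetical names, not part of the forum post.

```ruby
# Plain-Ruby stand-in for the Message model (no database, no Rails).
Message = Struct.new(:id, :author_id, :recipient_id, :parent_id, :content,
                     keyword_init: true) do
  # Build a reply: the current user becomes the author, the original author
  # becomes the recipient, and parent_id points back at the original message.
  def reply(id:, content:, current_user_id:)
    Message.new(id: id,
                author_id: current_user_id,
                recipient_id: author_id,
                parent_id: self.id,
                content: content)
  end
end

original = Message.new(id: 1, author_id: 10, recipient_id: 20, content: 'Hello')
answer   = original.reply(id: 2, content: 'Hi back', current_user_id: 20)
# answer.author_id is 20, answer.recipient_id is 10, answer.parent_id is 1
```

Only the content comes from the reply form; the other three columns are derived from the original message and the logged-in user.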
Schemas
Schema Mania
http://www.schemamania.org/schemamania/
Schema Mania was conceived as a repository of database designs. You'd be able to come here, browse for a database design in your "problem space". [...]
Tools
Using Dia
http://www.schemamania.org/schemamania/.
- First of all, there's Dia herself
  - Dia can be used to draw many different kinds of diagrams. It currently has special objects to help draw entity relationship diagrams, UML diagrams, flowcharts, network diagrams, and simple circuits.
- ftp://az.water.usgs.gov/pub/ashalper/src/dia/plug-ins/ Dia SQL export plug-in
  - Currently supports only UML class diagrams, and can generate only plain SQL CREATE TABLE statements.
- http://sourceforge.net/projects/dia2code/ dia2code
  - Generates C++ and Java code from a UML Dia diagram.
- http://www.droogs.org/autodial/ AutoDia
  - Creates Dia UML diagrams from source code. Does not read SQL; included because it demonstrates possibilities.
