PDA

View Full Version : Help with normalization (Again)



Mystic Muse
2014-03-18, 08:24 PM
I need help with a normalization problem again. This time, I'm asking for help well before the deadline (Sunday at 11:55 PM Eastern time) rather than the day of the deadline.

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capturewhatever_zps9fe14e98.png (http://s1206.photobucket.com/user/Zernobog888/media/Capturewhatever_zps9fe14e98.png.html)

As before, I just need help. This is worth HALF the points on this exam thingy, so the more help I can get, the better. I'm working on the other stuff I can do in the meantime.

Just please help. I can't do another bad semester.

TaiLiu
2014-03-18, 10:04 PM
I'd like to assist, but I have no idea what to do. What exactly are you asking assistance for?

Ashtar
2014-03-19, 06:03 AM
The 3NF (http://en.wikipedia.org/wiki/Third_normal_form) requires attibutes to be in 2NF and have a dependency only on the key.

Have a read of Database Normalization (http://en.wikipedia.org/wiki/Database_normalization) again. I'm at work right now, so I can't write out all the steps. I'll try and do before 19h00 GMT+1.

Very quickly, I can see a table students, a table instructors and a table courses, a table student-course-enrolment-with-grade, a table courses-by-instructor.

Shortly:
1NF -> Has a key, all attributes are atomic (no multivalued fields)
2NF ->1NF and all the functional dependencies are elementary (split out repeated items in the table)
3NF -> 2NF and all the functional dependencies are direct (split out what has a direct dependency on one another to avoid repeats).
BCNF -> 3NF and All non key attributes of a relation have no functional dependency towards the key, reinforces the 3NF.

Chen
2014-03-19, 07:54 AM
Uh if this is an exam, isn't getting other people to do your work for you cheating?

Teddy
2014-03-19, 08:53 AM
It would help us a lot if you explained to us what you're having trouble with, because, as Chen said, we're not here to do your assignments for you. How much can you do on your own, and what concepts do you have trouble understanding? http://i.imgur.com/jLvmsd9.png

Also, I'm dropping my lecturer's slides on normalisation here (http://prezi.com/zhlm7nk94nz8/?utm_campaign=share&utm_medium=copy&rc=ex0share). If you understand the concept of functional dependencies, you can just double-click on the last square bracket box (the one with the yellow blob in it). They're listed as public and he has nowhere asked of us not to share them, so I think it should be okay...

EDIT:

BCNF -> 3NF and All non key attributes of a relation have no functional dependency towards the key, reinforces the 3NF.

Actually, BCNF only requires 1NF, because the criterion of 2NF and 3NF are fulfilled by always requiring a candidate key to be part of the determinant of every relationship.

Mystic Muse
2014-03-19, 10:18 PM
Uh if this is an exam, isn't getting other people to do your work for you cheating?

I'm not trying to get other people to do the work for me. That would be cheating, and not something I'd want to do. I've done the other 3 parts of it myself.



It would help us a lot if you explained to us what you're having trouble with, because, as Chen said, we're not here to do your assignments for you. How much can you do on your own, and what concepts do you have trouble understanding?

I'm not sure if the Student number field is allowed to repeat or not, and I think that's what's ultimately throwing me off. I've been reading up on normalization to try and get it clarified, but am having no luck.

Teddy
2014-03-20, 02:20 AM
I'm not sure if the Student number field is allowed to repeat or not, and I think that's what's ultimately throwing me off. I've been reading up on normalization to try and get it clarified, but am having no luck.

Well, if Student Number is your key, then no, no it isn't, but make note here that there are no mentions whatsoever in this assignment on what the key actually is, meaning that for as long as no two rows are identical, you'll be fine (although you probably want Student Number alone to be your primary key in the end result). http://i.imgur.com/jLvmsd9.png

Now, if this would've been an actual database, then just changing the primary key on your whim wouldn't work, but this is only a model, and one doesn't normalise a database by going through the steps on the actual tables, but rather by changing the database to the fully normalised schema you get from your models.

ChristianSt
2014-03-20, 01:09 PM
I'm not sure if the Student number field is allowed to repeat or not, and I think that's what's ultimately throwing me off. I've been reading up on normalization to try and get it clarified, but am having no luck.

The different normal forms doesn't really care whether attributes repeat or not. (Though 1NF doesn't allow repeating rows/tuples.) It matters only indirect because repeating attributes signalizes that something cannot be a key, which is relevant for some normal forms.

Whether or not student number is allowed to repeat or not depends on what the table represents/the underlying model. In a 1NF version of the given table it would be ok. (This table has no single attribute primary key. It would need a combination of multiple attributes.)

If you have a student table (or another table), then it depends on the underlying model. Since there is no information it can't really be told, but normally a student number should imo be unique per student.

[Also I'm not sure if you have seen all answers, but the information in the last thread on this topic (http://www.giantitp.com/forums/showthread.php?t=334547) is also still valid :smallwink:]

JeenLeen
2014-03-31, 11:01 AM
I need help with a normalization problem again. This time, I'm asking for help well before the deadline (Sunday at 11:55 PM Eastern time) rather than the day of the deadline.

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capturewhatever_zps9fe14e98.png (http://s1206.photobucket.com/user/Zernobog888/media/Capturewhatever_zps9fe14e98.png.html)

As before, I just need help. This is worth HALF the points on this exam thingy, so the more help I can get, the better. I'm working on the other stuff I can do in the meantime.

Just please help. I can't do another bad semester.

I get the normal forms confused. If 1NF is when everything is most divided into separate tables (poor phrasing, I know), then I would think:

Students Table:
StudentNumber (Primary Key), StudentName, Major

Teacher Table:
InstructorID (Primary Key), InstructorName, LocationInstructor,

Class Table:
CourseNumber (Primary Key), CourseName, InstructorID (Foreign Key)

Grades Table:
ID (Primary Key), StudentNumber (ForeignKey), CourseNumber (Foreign Key), Grade

I'm not sure if adding those ID fields as primary keys is okay or not. I guess you could use the InstructorName as a primary key, but there could be two with the same name. I'm also not sure if Grades fits it, since that has so many foreign keys... but this would let you pretty easily regenerate the source table you have with a query pulling from each table.

Again, I am NOT sure if this is 1NF or not, but maybe the above will help. (And I have no real idea how to show one's work. I thought through it in the way listed above, but it was just looking at the table and figuring out what could link to what, and then I had everything but Grades figured out, so I created a grades table.)

EDIT: I misread the question and only saw the part about 1st Normal Form. Still, hope this helps.
EDIT 2: With the Grades table, if you're allowed to use two values as a primary key, I could see the conjunction of StudentNumber and CourseNumber acting as a primary key (even though each are private keys)... assuming a student never takes the same course twice. Eh; guess that's not good advice, but I'll still throw this edit out there.

Mystic Muse
2014-04-01, 10:03 PM
Okay, I'm trying to fix what it is I'm doing wrong, but I'm completely lost. :smallfrown:


http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps7a45f349.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps7a45f349.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps13d102c7.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps13d102c7.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zpsaa35cab8.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zpsaa35cab8.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps9cf92b30.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps9cf92b30.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps245538d3.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps245538d3.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zpse14fe864.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zpse14fe864.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps4a947c8a.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps4a947c8a.png.html)

Normalization Test




For each of the following entities, indicate the binary relationship: (10 points)

Store and Pet: The store is presumably a store like Pets Mart that sells pets.

Customer and Order: A customer has a related order, such as a receipt for groceries/

Physician and Patient: A physician has a patient.


Employee and Parking Place: An employee has a corresponding Parking Place

Movie and Video Tape: A movie has a related VHS tape.



What are the steps for Normalization? (30 points)


Make sure there are no duplicated rows in the table.
Make sure each cell is single-valued (There are no repeating groups or arrays).
Entries in a column are of the same kind.

This brings a table to first normal form.

A table is in second normal form if it is in first normal form and if all non-key attributes are dependent on all of the key.

A table is in third normal form if it is in second normal form and if it has no transitive dependencies.

A table is in Boyce-codd Normal Form if it is in third normal form and every determinant is a candidate key.

A table is in fourth normal form if it is in Boyce-Codd Normal form and it has no multi-valued dependencies.

A table is in fifth normal form if it is in fourth normal form and every join dependency in the table is a consequence of the candidate keys of the table.

Finally, a table is in Domain-key Normal Form if every constraint on the table is a logical consequence of the definition of keys and domains.








DATABASE NORMALIZATION
Put the following table into Third Normal Form showing each step. Please show table in 1st normal form (50 points).

First, we list the Primary Keys and the related attributes

ŽStudent Number, Student Name, Major, Course Number, Course Name, instructor name, Instructor Location, Grade

Next we identify dependencies


ŽStudent Number, Student Name, Major, Course Number, Course Name, Grade, Instructor name, Instructor Location,

After that, we remove partial dependencies

Student Table (Student Number, Student Name, Major)
Course Table (Course Number, Course Name, Grade)
Instructor Table (Instructor Name, Instructor location)

The next step is to remove transitive dependency

To do this, I move Instructor, Course Number, and Course Name into a fourth table.

The next step is to connect the tables. The way I would do this is put Student Number in the other tables.


Student Table (Student Number, Student Name, Major)
Course Table (Course Number, Course Name, Grade, Student Number)
Instructor Table (Instructor Name, Instructor location, Student Number)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps3ce8e38b.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps3ce8e38b.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps3b53b761.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps3b53b761.png.html)

I'm not looking for people to do things for me. I'm just looking for help so I don't bomb another semester. :smallfrown:

ChristianSt
2014-04-02, 06:30 AM
Okay, I'm trying to fix what it is I'm doing wrong, but I'm completely lost. :smallfrown:


http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps7a45f349.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps7a45f349.png.html)
The cardinalities of the ER Diagram aren't ported correctly to the table. From the tables right now each costumer can have only one (and exactly one) order, but each order can be shared by multiple costumers. (If costumer_id is the primary key of the costumer table)



http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps13d102c7.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps13d102c7.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zpsaa35cab8.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zpsaa35cab8.png.html)
You should also add a "Costumer table", since you store redundant costumer information. (And I wouldn't call it Order table, since it only stores posts and not complete orders)



http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps9cf92b30.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps9cf92b30.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps245538d3.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps245538d3.png.html)
Seems fine to me



http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zpse14fe864.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zpse14fe864.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps4a947c8a.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps4a947c8a.png.html)
Having a "price table" with a primary key "product" (at least that seems to be the intent of the table) is just wrong. I don't see any reason to split the product information between multiple table (although someone could make arguments for split some of "Markup" <-> "cost" <-> "Price" in a new table, since that information seems redundant). But you should add a Supplier table (and imo also split the attribute supplier, right now it doesn't look atomic)



Normalization Test





Store and Pet: The store is presumably a store like Pets Mart that sells pets.

Customer and Order: A customer has a related order, such as a receipt for groceries/

Physician and Patient: A physician has a patient.


Employee and Parking Place: An employee has a corresponding Parking Place

Movie and Video Tape: A movie has a related VHS tape.




Make sure there are no duplicated rows in the table.
Make sure each cell is single-valued (There are no repeating groups or arrays).
Entries in a column are of the same kind.

This brings a table to first normal form.

A table is in second normal form if it is in first normal form and if all non-key attributes are dependent on all of the key.

A table is in third normal form if it is in second normal form and if it has no transitive dependencies.

A table is in Boyce-codd Normal Form if it is in third normal form and every determinant is a candidate key.

A table is in fourth normal form if it is in Boyce-Codd Normal form and it has no multi-valued dependencies.

A table is in fifth normal form if it is in fourth normal form and every join dependency in the table is a consequence of the candidate keys of the table.

Finally, a table is in Domain-key Normal Form if every constraint on the table is a logical consequence of the definition of keys and domains.








First, we list the Primary Keys and the related attributes

ŽStudent Number, Student Name, Major, Course Number, Course Name, instructor name, Instructor Location, Grade

Next we identify dependencies


ŽStudent Number, Student Name, Major, Course Number, Course Name, Grade, Instructor name, Instructor Location,

After that, we remove partial dependencies

Student Table (Student Number, Student Name, Major)
Course Table (Course Number, Course Name, Grade)
Instructor Table (Instructor Name, Instructor location)

The next step is to remove transitive dependency

To do this, I move Instructor, Course Number, and Course Name into a fourth table.

The next step is to connect the tables. The way I would do this is put Student Number in the other tables.


Student Table (Student Number, Student Name, Major)
Course Table (Course Number, Course Name, Grade, Student Number)
Instructor Table (Instructor Name, Instructor location, Student Number)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps3ce8e38b.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps3ce8e38b.png.html)

http://i1206.photobucket.com/albums/bb443/Zernobog888/Capture_zps3b53b761.png (http://s1206.photobucket.com/user/Zernobog888/media/Capture_zps3b53b761.png.html)

I'm not looking for people to do things for me. I'm just looking for help so I don't bomb another semester. :smallfrown:
The "Course table" is not really a "course table". If you have something and call it a "X table", then each row should feature one (and only one) X and each X should be in one (and one one) row. This ensures that there is no redundant data (like multiple appearances of a single course name). Also the course number doesn't seem do identify a course but a course/student pair (which is needed for a grade relation existing between those entities).

Or I'm not entirely correct in how those entities interact. From the instructors in the last table it is implied that there are in fact different courses (but imo that doesn't make sense, since it implies that each course can only have one student, which sounds stupid).

So without further information I can't really say more on the "Normalization Test".



If you are clearly lost, I think it is more helpful if you pinpoint concrete small topics and say what are your problems. Just dumping a bunch of questions and hope that you will find some useful stuff in the provided answers doesn't look like it would be productive (and it seems that most on the topic had already be said by someone in this or the past thread). I certainly don't mind helping out, but I don't like the feeling to answer the same thing multiple times without seeing any signs of improvement/understanding (though that may also be that we don't understand where exactly your problems are located).

Another solution would probably to grab some textbooks and start from the beginning (and if some stuff is unclear along the way, ask about that specific problem you encountered).

Mystic Muse
2014-04-04, 01:25 AM
The cardinalities of the ER Diagram aren't ported correctly to the table. From the tables right now each costumer can have only one (and exactly one) order, but each order can be shared by multiple costumers. (If costumer_id is the primary key of the costumer table)

...Whoops. Okay, that was a stupid mistake by me.

So, have two tables, one with

Order ID, Order, Order Date, and Customer ID (Order table)

and

Customer ID, Customer Name, and Customer Address (Customer table)

Instead, since that way both tables have primary keys that serve their intended function and neither table has unnecessary data?




You should also add a "Costumer table", since you store redundant costumer information. (And I wouldn't call it Order table, since it only stores posts and not complete orders)

Thanks.

Looking at what the instructor said, it sounds like I'm supposed to add an attribute to one of the tables. Presumably, that'd be something that can properly function as a primary key in the second table with the not complete orders?




Having a "price table" with a primary key "product" (at least that seems to be the intent of the table) is just wrong. Okay, I have no idea what my reasoning was for that or why on Earth that would have happened. :smallconfused:



I don't see any reason to split the product information between multiple table (although someone could make arguments for split some of "Markup" <-> "cost" <-> "Price" in a new table, since that information seems redundant). But you should add a Supplier table (and imo also split the attribute supplier, right now it doesn't look atomic)

Thank you.



The "Course table" is not really a "course table". If you have something and call it a "X table", then each row should feature one (and only one) X and each X should be in one (and one one) row. This ensures that there is no redundant data (like multiple appearances of a single course name). Also the course number doesn't seem do identify a course but a course/student pair (which is needed for a grade relation existing between those entities).

Or I'm not entirely correct in how those entities interact. From the instructors in the last table it is implied that there are in fact different courses (but imo that doesn't make sense, since it implies that each course can only have one student, which sounds stupid).

So without further information I can't really say more on the "Normalization Test". Thank you for the help. I also noticed an error in the "Step-by-step" part I was supposed to do.




If you are clearly lost, I think it is more helpful if you pinpoint concrete small topics and say what are your problems. Just dumping a bunch of questions and hope that you will find some useful stuff in the provided answers doesn't look like it would be productive (and it seems that most on the topic had already be said by someone in this or the past thread). I certainly don't mind helping out, but I don't like the feeling to answer the same thing multiple times without seeing any signs of improvement/understanding (though that may also be that we don't understand where exactly your problems are located).

Sorry about that. Honestly, I'm just a little overwhelmed with my whole workload this semester and was hoping for answers to specific problems I was having.

ChristianSt
2014-04-04, 04:57 AM
Sorry about that. Honestly, I'm just a little overwhelmed with my whole workload this semester and was hoping for answers to specific problems I was having.
No problem, I don't think asking questions (even a lot of questions, though I think it makes it unlikelier for people to replay) is bad. Asking follow up questions isn't bad either. So feel free to ask some more stuff. :smallwink:

But from the last thread and this I had the feeling that the questions/answers featured too much repetition. (Which can mean a multiple of things. It can for example just mean that there is some sort of communication problem (or that the answers isn't good). But in all cases I think just repeating the question is not helpful.) And if I get the feeling that I tried to give a good answer and it doesn't brought any reaction from you, it is just more likely that I will simply stop answering your questions, which doesn't help you, either.



Looking at what the instructor said, it sounds like I'm supposed to add an attribute to one of the tables. Presumably, that'd be something that can properly function as a primary key in the second table with the not complete orders?

It is likely you/your instructor means a surrogate key (http://en.wikipedia.org/wiki/Surrogate_key) (basically just a unique number, sometimes just incremented by 1 from a starting number).
If the primary key is unwieldy somehow (e.g. multiple attributes, long/complex attributes, often changing) then most people don't want to deal with the original primary key, for example because it is needed as a foreign key in other relations. With the current "partial order table", if you have another relation that needs a foreign key to the partial order table, it basically needs to repeat the complete row. (Because right now the primary key of "the partial order table" is the set of all attributes. Though maybe Part+Invoice would be enough if a order containing "1 Screw" and "2 Screw" is not valid and will be only stored as "3 Screw").
So it is quite popular to use a surrogate key instead of the original primary key, to not deal with certain possible problems (but it also enables some other sort of problems, like needing an additional attribute or requiring to not miss the fact that the original key is still a key, thus still requiring certain restrictions like unique not null. Though most of the times the benefits vastly outclass the drawbacks.)

But there is no hard rule when to use a surrogate key or not, and it doesn't really have that much to do with normalization.

Slipperychicken
2014-04-09, 10:08 AM
My professor said that when in doubt, use autonumber for a primary key, to avoid headaches resulting from composite keys (unless you're doing a many-to-many or something). Even with just Access, it's not going to be a problem unless you have like 2 billion autonumbered entries, in which case you might want to switch to a bigger program.

Also, you want your primary keys to be unique identifiers which won't change, so that the software can tell one row from the others. This is why a lot of large organizations will generate synthetic keys to track people with.

No idea why he's making you show each normal form, though. It seems simple enough to go straight to 3rd if you know what your doing (or if you can just redesign the database from scratch).