PDA

View Full Version : Sorted sets



Razanir
2013-11-15, 10:42 PM
Let's say you have a Java class


public class Product{
private String name;
private double price;

// constructors, getters, and setters
}

Does it make logical sense to have a SortedSet that orders by price, but checks equality by name? Or do those both have to operate on the same field(s)?

Keris
2013-11-16, 07:58 AM
The SortedSet interface stipulates that any comparator or compareTo method must be consistent with equals, i.e. e1.compareTo(e2) == 0 (or c.compare(e1, e2) == 0 if you're using a comparator) has the same boolean value as e1.equals(e2) for all e1 and e2 in the set.

This is because Java's SortedSet implementations use compare or compareTo for all comparisons; a SortedSet that uses a comparator that was not consistent with equals would treat some unequal elements as equal elements (and vice versa) and so wouldn't obey the contract of the Set interface.

However, I couldn't see anything that would prevent you from writing a new SortedSet implementation that orders its elements by a compareTo while also checking for equality with equals. It might just not be as efficient as the library implementations. :smallwink:

Razanir
2013-11-16, 01:26 PM
Well this is (sort of) for an assignment, so I can't use existing implementations anyway. I was just wondering if it was a valid design choice for SortedSet implementations.

TuggyNE
2013-11-16, 05:58 PM
Well this is (sort of) for an assignment, so I can't use existing implementations anyway. I was just wondering if it was a valid design choice for SortedSet implementations.

If it breaks the interface contract, chances are it's not.

Not too familiar with Java, but it sounds like it's probably designed with that assumption, so changing it could cause subtle problems in weird places.

Razanir
2013-11-16, 07:31 PM
If it breaks the interface contract, chances are it's not.

Not too familiar with Java, but it sounds like it's probably designed with that assumption, so changing it could cause subtle problems in weird places.

In which case...

Wee... Another problem with the Java collections library. :smallsigh:

My on-going list of grievances:
1) Why aren't Maps Collections?
2) There's no place in the hierarchy for a map that sorts by value.
3) Why do Stack (and similar classes) implement List?
4) And for that matter, why did they actually implement add(int index, Object o)? If you're going to have bad typing, at least throw exceptions for the methods that shouldn't be callable.

TuggyNE
2013-11-16, 09:31 PM
My on-going list of grievances:
1) Why aren't Maps Collections?
2) There's no place in the hierarchy for a map that sorts by value.
3) Why do Stack (and similar classes) implement List?
4) And for that matter, why did they actually implement add(int index, Object o)? If you're going to have bad typing, at least throw exceptions for the methods that shouldn't be callable.

I dislike Java for these kinds of rather odd things; their object system never made nearly as much sense to me as .NET's, and it seems to be the product of multiple redesigns that never quite covered all the previous aspects.

Teddy
2013-11-17, 08:56 AM
In which case...

Wee... Another problem with the Java collections library. :smallsigh:

Honestly, I fail to see why this is a problem with Java. I'd solve this problem by comparing first after price and second after name, meaning you'd have your set sorted after price, with wares sharing the same price sorted alphabetically, and no problems with queries or other equality-based operations...


My on-going list of grievances:
1) Why aren't Maps Collections?
2) There's no place in the hierarchy for a map that sorts by value.
3) Why do Stack (and similar classes) implement List?
4) And for that matter, why did they actually implement add(int index, Object o)? If you're going to have bad typing, at least throw exceptions for the methods that shouldn't be callable.

1) Because the Collection interface specify functions which aren't defined for maps, such as add(<E> e).

2) Think about it, when would such a map make sense? The point of a map is to index values with the help of keys, usually giving you logarithmic or constant time complexity to insert or find the value of a specific key, meaning that you have to sort it after the keys. If you sort it after value, you'd have linear time insertions/extractions, which defeats the point of the map. If you want a lower time complexity on retrieving the values in order rather than indexing by the keys, you should probably swap them, using the values as keys and keys as values, and use a map implementation which provides and iterator which guarantees in-order iteration. Or use a double-solution with two maps, one key-value, and one value-key.

3) Because Stack extends Vector, which implements List.

4) Implemented it in Stack, you mean? Because it was already implemented in Vector, presumably, and they didn't override it. Actually, I'm not really sure what you're talking about...

Razanir
2013-11-17, 10:26 AM
Honestly, I fail to see why this is a problem with Java. I'd solve this problem by comparing first after price and second after name, meaning you'd have your set sorted after price, with wares sharing the same price sorted alphabetically, and no problems with queries or other equality-based operations...

It's a bit more complicated. What I'm actually wondering about is a map sorted by values, not keys. :smallbiggrin:


1) Because the Collection interface specify functions which aren't defined for maps, such as add(<E> e).

But why are they in the interface in the first place? Wouldn't it make more sense to have a Collection interface that is JUST methods for logically and semantically grouping data? I.e. don't assume the user only wants to parameterize on the one value.


2) Think about it, when would such a map make sense? The point of a map is to index values with the help of keys, usually giving you logarithmic or constant time complexity to insert or find the value of a specific key, meaning that you have to sort it after the keys. If you sort it after value, you'd have linear time insertions/extractions, which defeats the point of the map. If you want a lower time complexity on retrieving the values in order rather than indexing by the keys, you should probably swap them, using the values as keys and keys as values, and use a map implementation which provides and iterator which guarantees in-order iteration. Or use a double-solution with two maps, one key-value, and one value-key.

:smallconfused: I thought the point of maps was to have associative arrays. Make it so that you don't have to index values with sequential integers.


3) Because Stack extends Vector, which implements List.

Okay... So why have it extend Vector, then? Why not have it implement Collection directly, and avoid all this mess around add-at-index


4) Implemented it in Stack, you mean? Because it was already implemented in Vector, presumably, and they didn't override it. Actually, I'm not really sure what you're talking about...

You can call add(int index, E element) on a Stack, and it will execute. If they really did have to make the design decision to extend Vector, why not override add(int index, E element) to throw an exception?

Teddy
2013-11-17, 11:21 AM
But why are they in the interface in the first place? Wouldn't it make more sense to have a Collection interface that is JUST methods for logically and semantically grouping data? I.e. don't assume the user only wants to parameterize on the one value.

Presumably because of an error of thought a good decade or two ago which now can't be rectified without breaking millions of Java programs. Java isn't exactly a new language...


:smallconfused: I thought the point of maps was to have associative arrays. Make it so that you don't have to index values with sequential integers.

Yes, exactly. What I kinda failed to communicate is that efficiently mapping the keys to indexes is a quite tricky subject, which is why we only have tree and hash implementations (plus one list-based I haven't explored), one which uses ordering to decide which element goes where based on an arbitrary root element, and the other which hashes the key to a integer which is sufficiently unlikely to overlap other hashings that they can be used with some few collision handling techniques.

Now, if you were to sort after values but index using keys, the tree implementation would just fail, and you'd have to walk the tree, finding your value in linear time (approaching logarithmic if you parallelise it). Hashing implementations lack order to start with, since no efficient hashing functions have been invented which can guarantee ordering, so that won't work either. All in all, there is no such thing as sorting by value but indexing by keys if you want better than linear extraction.

No, if you want to enjoy both less-than-linear time indexing and have it sorted by value, you need to keep a sorted value-key map (note the reverse order, values will be used as keys in this one, and vice versa) by the side, using the first map to index and the second one to retrieve them in sorted order.


Okay... So why have it extend Vector, then? Why not have it implement Collection directly, and avoid all this mess around add-at-index

Presumably for a fast implementation which won't have to allocate new memory every time you insert something...


You can call add(int index, E element) on a Stack, and it will execute. If they really did have to make the design decision to extend Vector, why not override add(int index, E element) to throw an exception?

Because someone was lazy and thought that no one would be so stupid as to use add as a stack operation...

nedz
2013-11-17, 02:08 PM
It's a bit more complicated. What I'm actually wondering about is a map sorted by values, not keys. :smallbiggrin:

A key is just how you access the data within the collection. If you access it by price, then that's your key.

Quite how you structure your data depends upon your algorithm, which in turn depends upon the problem you are trying to solve.

It should be possible to create two collections, holding the same common set of objects, one sorted by name, the other by price. Maintaining these collections can be quite a chore however.