A **set** is a *sequential* abstract data type where following rules have to be obeyed:

- any entry must be unique

Although, as a abstract data type, set comes with no implementation, it requires from data structures that implement it to provide support for a minimal set of operations:

Operation |
Arguments |
Returns |
Description |

add | VALUE | void | Adds value to set. |

clear | void | Clears set of all values. | |

contains | VALUE | bool | Checks if value exists in set. |

find | VALUE | VALUE | Finds and returns value in set. |

remove | VALUE | void | Removes element by value. |

isEmpty | bool | Checks if set is empty | |

size | ULONG | Gets set size |

Most common implementations of set are:

- Hash Table
- Strengths:
- By far the fastest in insertion/lookup/deletion of entries: because hash table only needs to compute a hash of entry and look up in bucket (List at position @ List) that corresponds to hash to add/delete/retrieve that entry. Because bucket ideally contains one entry only, this operation is considered to occur in constant O(1) time.
- By far the smallest in memory consumption: only a List of List.

- Weaknesses:
- Unpredictable iteration, which makes it hard to test.
- It depends on:
- a fast hash function that guarantees even distribution throughout buckets, so that the List inside ideally contains only one entry.
- a reliable comparator function that guarantees there won't be duplicates (which would violate above-mentioned set principles)

- Strengths:
- Linked Hash Table
- Strengths:
- Almost as fast, just a bit slower (5-10%) than Hash Table on insertion/lookup/deletion of entries: because it needs to maintain the doubly linked list that preserves insertion order as well.
- Predictable iteration, which makes testing easy

- Weaknesses:
- Greater memory consumption (33%) than Hash Table, because it needs to store the doubly linked list that maintains insertion order.
- Same dependency on a good hash/comparator function.

- Strengths:
- Red Black Tree
- Strengths:
- Stays ordered.
- Predictable iteration, which makes testing easy

- Weaknesses:
- Much slower (6-7 times) than Hash Table in insertion/deletion of entries: because the red black tree always has to rebalance itself, which is a VERY costly operation that may involve the whole data set.
- Greater memory consumption (33%) than Hash Table, because the struct that wraps each value requires more information.
- Depends on a reliable comparator function.

- Strengths:

This time complexity table using Big O Notation shows advantages and disadvantages of each data structure chosen for each set operation:

Operation |
Hash Table | Linked Hash Table | Red Black Tree |

add | O(1) | O(1) | O(log(N)) |

clear | O(N) | O(N) | O(N) |

contains | O(1) | O(1) | O(log(N)) |

find | O(1) | O(1) | O(log(N)) |

remove | O(1) | O(1) | O(log(N)) |

isEmpty | O(1) | O(1) | O(1) |

size | O(1) | O(1) | O(1) |

**Conclusion**: Hash Table, with its speed/memory efficiency advantages, should be solution of choice unless one specifically desires structure to be predictably iterated or self-ordered!