The set data type is defined as an "unordered collection of distinct hashable objects" according to the Python 3 documentation. You can use a set for membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
Because a set is an unordered collection, it does not record element position or order of insertion. As a result, sets do not support indexing, slicing, or other sequence-like behaviors that you have seen with lists and tuples.
There are two types of set built into the Python language:
set - which is mutable
frozenset - which is immutable and hashable
This article will focus on set.
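To make the difference between the two types concrete, here is a minimal sketch of what "hashable" buys you. The names vowels, lookup, and message are made up for illustration:

```python
# A frozenset is hashable, so it can serve as a dictionary key.
vowels = frozenset("aeiou")
lookup = {vowels: "vowels"}

# Two frozensets with the same members are equal, whatever order they were built in.
print(lookup[frozenset("uoiea")])

# A regular set is mutable and therefore unhashable, so this raises a TypeError.
message = ""
try:
    bad_lookup = {set("aeiou"): "vowels"}
except TypeError as error:
    message = str(error)

print(message)
```

If you need a set of sets, or a set as a dictionary key, frozenset is the type to reach for.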
You will learn how to do the following with sets:
Create a set
Access set members
Add items
Remove items
Clear or delete a set
Use set operations
Let's get started by creating a set!
Creating a set is pretty straightforward. You can create one by placing a series of comma-separated objects inside curly braces, or you can pass a sequence to the built-in set() function.
Let's look at an example:
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set
    {'c', 'a', 'b'}
    >>> type(my_set)
    <class 'set'>
A set uses the same curly braces that you used to create a dictionary. Note that instead of key: value pairs, you have a series of values. When you print out the set, you can see that duplicates were removed automatically.
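Because sets and dictionaries share curly braces, one case is worth checking for yourself: empty braces create a dictionary, not a set. The following short sketch uses illustrative variable names:

```python
# Empty curly braces always create a dict.
empty_braces = {}
# To get an empty set, call set() with no arguments.
empty_set = set()

print(type(empty_braces))
print(type(empty_set))

# Braces with values (and no colons) create a set.
letters = {"h", "e", "l", "l", "o"}
print(len(letters))
```

The last line prints 4 because the duplicate "l" is stored only once.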
Now let's try creating a set using set():
    >>> my_list = [1, 2, 3, 4]
    >>> my_set = set(my_list)
    >>> my_set
    {1, 2, 3, 4}
    >>> type(my_set)
    <class 'set'>
In this example, you created a list and then converted it to a set using set(). If there had been any duplicates in the list, they would have been removed.
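Here is a small sketch of that de-duplication in action. The colors list is invented for this example, and sorted() is used only to get a predictable order back, since the set itself has none:

```python
colors = ["red", "green", "red", "blue", "green"]

# Converting to a set drops the repeated values.
unique_colors = set(colors)

# Sets are unordered, so sort the result if you need a stable list.
unique_sorted = sorted(unique_colors)
print(unique_sorted)  # ['blue', 'green', 'red']
```

Note that the original order of the list is not preserved when you go through a set.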
Now let's move along and see some of the things that you can do with this data type.
You can check if an item is in a set by using Python's in operator:
    >>> my_set = {"a", "b", "c", "c"}
    >>> "a" in my_set
    True
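Membership tests also work in the negative with not in. The sketch below assumes a hypothetical set of allowed file extensions:

```python
# Hypothetical data: file extensions that a program is willing to open.
allowed_extensions = {".py", ".txt", ".md"}

has_py = ".py" in allowed_extensions            # True
has_exe = ".exe" in allowed_extensions          # False
missing_exe = ".exe" not in allowed_extensions  # True

print(has_py, has_exe, missing_exe)
```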
Sets do not allow you to use slicing or the like to access individual members of the set. Instead, you need to iterate over a set. You can do that using a loop, such as a while loop or a for loop.
You won't be covering loops until chapter 12, but here is the basic syntax for iterating over a collection using a for loop:
    >>> for item in my_set:
    ...     print(item)
    ...
    c
    a
    b
This will loop over each item in the set one at a time and print it out.
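If you are curious what happens when you try to index a set anyway, or you want the items in a predictable order, this sketch shows both. The flag indexing_failed and the list ordered_items are illustrative names:

```python
my_set = {"a", "b", "c"}

# Indexing is not supported, so this raises a TypeError.
indexing_failed = False
try:
    my_set[0]
except TypeError:
    indexing_failed = True

# sorted() returns a new list, which gives you a stable iteration order.
ordered_items = sorted(my_set)
for item in ordered_items:
    print(item)
```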
Once a set is created, you cannot change any of its items in place, because there is no index to assign to.
However, you can add new items to a set. Let's find out how!
There are two ways to add items to a set:
add()
update()
Let's try adding an item using add():
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.add('d')
    >>> my_set
    {'d', 'c', 'a', 'b'}
That was easy! You were able to add an item to the set by passing it into the add() method.
If you'd like to add multiple items all at once, then you should use update() instead:
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.update(['d', 'e', 'f'])
    >>> my_set
    {'a', 'c', 'd', 'e', 'b', 'f'}
Note that update() will take any iterable you pass to it. So it could take, for example, a list, tuple, or another set.
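Since a string is also an iterable, it is worth seeing how add() and update() treat one differently. The following sketch uses made-up names (tags, word_added, word_updated) and also shows that update() accepts several iterables in one call:

```python
tags = {"python"}
# update() accepts more than one iterable at a time.
tags.update(["sets", "lists"], ("tuples",), {"dicts"})
print(sorted(tags))

# add() inserts the whole string as a single item...
word_added = set()
word_added.add("abc")

# ...while update() iterates over the string and adds each character.
word_updated = set()
word_updated.update("abc")

print(word_added)
print(sorted(word_updated))
```

If you meant to add one string and end up with single characters in your set, update() on a bare string is the likely cause.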
You can remove items from sets in several different ways.
You can use:
remove()
discard()
pop()
Let's go over each of these in the following sub-sections!
The remove() method will attempt to remove the specified item from a set:
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.remove('a')
    >>> my_set
    {'c', 'b'}
If you happen to ask the set to remove() an item that does not exist, Python will raise a KeyError:
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.remove('f')
    Traceback (most recent call last):
      Python Shell, prompt 208, line 1
    builtins.KeyError: 'f'
Now let's see how the closely related discard() method works!
The discard() method works in almost exactly the same way as remove() in that it will remove the specified item from the set:
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.discard('b')
    >>> my_set
    {'c', 'a'}
The difference with discard(), though, is that it won't raise an error if you try to remove an item that doesn't exist:
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.discard('d')
    >>>
If you want to be able to catch an error when you attempt to remove an item that does not exist, use remove(). If that doesn't matter to you, then discard() might be a better choice.
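As a rough sketch of that choice, the following compares the two methods side by side. The variable missing_item is just an illustrative name for whatever the KeyError reports:

```python
my_set = {"a", "b", "c"}

# discard() silently does nothing when the item is absent.
my_set.discard("z")

# remove() raises a KeyError, which you can catch and react to.
missing_item = None
try:
    my_set.remove("z")
except KeyError as error:
    missing_item = error.args[0]

print(missing_item)    # z
print(sorted(my_set))  # ['a', 'b', 'c']
```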
The pop() method will remove and return an arbitrary item from the set:
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.pop()
    'c'
    >>> my_set
    {'a', 'b'}
If your set is empty and you try to pop() an item out, you will receive an error:
    >>> my_set = {"a"}
    >>> my_set.pop()
    'a'
    >>> my_set.pop()
    Traceback (most recent call last):
      Python Shell, prompt 219, line 1
    builtins.KeyError: 'pop from an empty set'
This is very similar to the way that pop() works with the list data type, except that popping from an empty list raises an IndexError rather than a KeyError. Also, lists are ordered while sets are not, so you can't be sure which item pop() will remove from a set.
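One way to avoid the empty-set error is to check the set before popping. An empty set is falsy, so a while loop can drain it safely. This is only a sketch, and the names pending and processed are invented:

```python
pending = {"a", "b", "c"}
processed = []

# An empty set evaluates to False, so the loop stops before pop() can fail.
while pending:
    processed.append(pending.pop())

# The order of removal is arbitrary, so sort before comparing or printing.
print(sorted(processed))  # ['a', 'b', 'c']
print(pending)            # set()
```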
Sometimes you will want to empty a set or even completely remove it.
To empty a set, you can use clear():
    >>> my_set = {"a", "b", "c", "c"}
    >>> my_set.clear()
    >>> my_set
    set()
If you want to completely remove the set, then you can use Python's del statement, which deletes the name my_set:
    >>> my_set = {"a", "b", "c", "c"}
    >>> del my_set
    >>> my_set
    Traceback (most recent call last):
      Python Shell, prompt 227, line 1
    builtins.NameError: name 'my_set' is not defined
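The difference between clear() and del is easiest to see when two names refer to the same set. The sketch below uses hypothetical names (original, alias) to show that del removes only a name, while clear() empties the set object itself:

```python
original = {"a", "b"}
alias = original  # both names now refer to the same set object

# del removes the name "original", not the set it referred to.
del original
name_is_gone = False
try:
    original
except NameError:
    name_is_gone = True

# The set is still reachable through the other name.
alias_after_del = set(alias)

# clear() empties the set object in place.
alias.clear()

print(name_is_gone)             # True
print(sorted(alias_after_del))  # ['a', 'b']
print(alias)                    # set()
```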
Now let's learn what else you can do with sets!
Sets provide you with some common operations such as:
union() - Combines two sets and returns a new set
intersection() - Returns a new set with the elements that are common between the two sets
difference() - Returns a new set with elements that are not in the other set
These operations are the most common ones that you will use when working with sets.
The union() method is similar to the update() method that you learned about earlier, in that it combines two or more sets. The difference is that union() returns a new set rather than updating the original set with new items:
    >>> first_set = {'one', 'two', 'three'}
    >>> second_set = {'orange', 'banana', 'peach'}
    >>> first_set.union(second_set)
    {'two', 'banana', 'three', 'peach', 'orange', 'one'}
    >>> first_set
    {'two', 'three', 'one'}
In this example, you create two sets. Then you call union() on the first set and pass in the second set. However, union() doesn't update the first set. It creates a new set. If you want to save the new set, then you should do the following instead:
    >>> united_set = first_set.union(second_set)
    >>> united_set
    {'two', 'banana', 'three', 'peach', 'orange', 'one'}
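Python also offers the | operator as a shorthand for union(). The sketch below reuses first_set and second_set and introduces a few illustrative result names. One difference to be aware of is that the method accepts any iterable, while the operator expects sets on both sides:

```python
first_set = {"one", "two", "three"}
second_set = {"orange", "banana", "peach"}

united_by_method = first_set.union(second_set)
united_by_operator = first_set | second_set
print(united_by_method == united_by_operator)  # True

# union() accepts any number of iterables, not only sets.
united_with_others = first_set.union(["four"], ("five",))
print(sorted(united_with_others))

# The original set is left untouched.
print(sorted(first_set))  # ['one', 'three', 'two']
```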
The intersection() method takes two sets and returns a new set that contains only the items that are the same in both of the sets.
Let's look at an example:
    >>> first_set = {'one', 'two', 'three'}
    >>> second_set = {'orange', 'banana', 'peach', 'one'}
    >>> first_set.intersection(second_set)
    {'one'}
These two sets have only one item in common: the string "one". So when you call intersection(), it returns a new set with a single element in it. As with union(), if you want to save off this new set, then you would want to do something like this:
    >>> intersection = first_set.intersection(second_set)
    >>> intersection
    {'one'}
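The & operator does the same job as intersection(). If you only need to know whether two sets overlap at all, the isdisjoint() method answers that without building a new set. In this sketch, third_set and the result names are made up for illustration:

```python
first_set = {"one", "two", "three"}
second_set = {"orange", "banana", "peach", "one"}
third_set = {"apple", "kiwi"}

common_by_method = first_set.intersection(second_set)
common_by_operator = first_set & second_set
print(common_by_method == common_by_operator)  # True

# isdisjoint() returns True when the two sets share no items.
no_overlap = first_set.isdisjoint(third_set)
some_overlap = first_set.isdisjoint(second_set)
print(no_overlap, some_overlap)  # True False
```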
The difference() method will return a new set with the elements of the set you call it on that are not in the other set. This can be a bit confusing, so let's look at a couple of examples:
    >>> first_set = {'one', 'two', 'three'}
    >>> second_set = {'three', 'four', 'one'}
    >>> first_set.difference(second_set)
    {'two'}
    >>> second_set.difference(first_set)
    {'four'}
When you call difference() on the first_set, it returns a set with "two" as its only element. This is because "two" is the only string in first_set that is not found in the second_set. When you call difference() on the second_set, it returns a set containing "four" because "four" is not in the first_set.
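Because the result depends on which set you call difference() on, it can help to see both directions next to the symmetric difference mentioned at the start of this article. The following sketch uses the - and ^ operators, which mirror difference() and symmetric_difference(). The result names are illustrative:

```python
first_set = {"one", "two", "three"}
second_set = {"three", "four", "one"}

# Items in first_set that are not in second_set, and the reverse.
only_in_first = first_set - second_set
only_in_second = second_set - first_set
print(only_in_first, only_in_second)  # {'two'} {'four'}

# Items that appear in exactly one of the two sets.
in_exactly_one = first_set.symmetric_difference(second_set)
print(sorted(in_exactly_one))  # ['four', 'two']
print(in_exactly_one == first_set ^ second_set)  # True
```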
There are other methods that you can use with sets, but they are used pretty infrequently. You should check the documentation for full details on set methods should you need to use them.
Sets are a great data type for some fairly specific situations. You will find sets most useful for de-duplicating lists or tuples, or for finding differences between multiple lists.
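As one possible illustration of finding differences between lists, the sketch below compares two made-up lists of user names (old_users and new_users are hypothetical data) by converting each to a set first:

```python
# Hypothetical data: user names before and after some change.
old_users = ["ann", "bob", "cara", "bob"]
new_users = ["bob", "cara", "dev"]

# Converting to sets removes duplicates and enables set operations.
removed = set(old_users) - set(new_users)
added = set(new_users) - set(old_users)
unchanged = set(old_users) & set(new_users)

print(removed)            # {'ann'}
print(added)              # {'dev'}
print(sorted(unchanged))  # ['bob', 'cara']
```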
In this article, you learned about the following:
Creating a set
Accessing set members
Adding items
Removing items
Clearing or deleting a set
Set operations
Any time you need to use a set-like operation, you should take a look at this data type. However, in all likelihood, you will be using lists, dictionaries, and tuples much more often.
Copyright © 2024 Mouse Vs Python | Powered by Pythonlibrary