A pandas Series is a one-dimensional labelled array. It can hold any data type, such as int, float, string, or Python objects. The labels are collectively known as the index.

Syntax:

s = pd.Series(data, index=index)

There are many ways to create a pandas Series. The most commonly used ones are listed below. Pick the one that suits your Python program.

  1. Create an empty series
  2. Create a series from python list – without index
  3. Create a series from python list – with index
  4. Create a series from numpy array – without index
  5. Create a series from numpy array – with index
  6. Create a series from dictionary – without index
  7. Create a series from dictionary – with index
  8. Create a series from dictionary – additional values mentioned in index
  9. Create a series with a constant number
  10. Create a series where the index is a numpy array

1. Create an empty series

import numpy as np
import pandas as pd
# 1. Create an empty series
s1 = pd.Series(dtype="object")
s1

Output:

Series([], dtype: object)

You can also create an empty series using pd.Series() without specifying the dtype, but in pandas 1.x you will get a warning. In those versions an empty series created this way defaults to “float64”, and the warning announces that the default will change to “object” in a future version. So it is better to specify the data type explicitly, so that the code behaves the same across versions.

Some common data types are float64, int64, object, datetime64, etc.
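As a small sketch of the point above, the snippet below creates empty series with a few explicit dtypes and checks what pandas reports. The variable names are only illustrative and are not part of the original examples.

```python
import pandas as pd

# Empty series with explicitly chosen dtypes (illustrative names).
empty_obj = pd.Series(dtype="object")
empty_float = pd.Series(dtype="float64")
empty_int = pd.Series(dtype="int64")

# Each series has no elements but carries the dtype that was requested.
print(empty_obj.dtype, empty_float.dtype, empty_int.dtype)
print(len(empty_obj))
```

Passing the dtype explicitly avoids relying on the version-dependent default.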

2. Create a series from python list – without index

# 2. Create a series from python list - without index
s2 = pd.Series([10, 11, 12, 13, 14])
s2
Output:
0    10
1    11
2    12
3    13
4    14
dtype: int64

Here, I have not set an index for the series, so by default it is indexed with integers starting from 0.
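To see the default index more directly, you can inspect the index attribute. This is a minimal sketch that reuses the same values as s2 under a different, illustrative name.

```python
import pandas as pd

s = pd.Series([10, 11, 12, 13, 14])

# With no index given, pandas builds a RangeIndex starting at 0.
print(type(s.index).__name__)
print(list(s.index))

# The default integer labels can be used to look up values.
first = s[0]
last = s[4]
print(first, last)
```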

3. Create a series from python list – with index

# 3. Create a series from python list - with index
s3 = pd.Series([10, 11, 12, 13, 14], index=["a", "b", "c", "d", "e"])
s3
Output:
a    10
b    11
c    12
d    13
e    14
dtype: int64

Here, the output is indexed as a, b, c, etc., using the list passed to the index parameter. Compare this with the previous section to see how the index differs. In this way, you can set an index of your choice.
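Once a series has custom labels, values can be looked up by label as well as by position. The following is a small sketch under that assumption. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series([10, 11, 12, 13, 14], index=["a", "b", "c", "d", "e"])

# Look up a single value by its label.
by_label = s["c"]

# Look up the same value by its integer position.
by_position = s.iloc[2]

# Select several labels at once; the result is again a Series.
subset = s[["a", "e"]]

print(by_label, by_position)
print(subset)
```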

4. Create a series from numpy array – without index

# 4. Create a series from numpy array - without index
s4 = pd.Series(np.arange(3, 8))
s4
Output:
0    3
1    4
2    5
3    6
4    7
dtype: int32

The data passed to the pd.Series function is a numpy array. I have not set the index explicitly, so the default index, which starts from 0, is used.
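Note that the dtype in the output above (int32) comes from the numpy array and can differ between platforms and numpy versions. On many systems the same code reports int64. If the dtype matters, one option is to request it explicitly, as in this sketch:

```python
import numpy as np
import pandas as pd

# Ask numpy for 64-bit integers explicitly instead of the platform default.
values = np.arange(3, 8, dtype=np.int64)
s = pd.Series(values)

# The series takes its dtype from the array.
print(s.dtype)
print(s.tolist())
```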

5. Create a series from numpy array – with index

# 5. Create a series from numpy array - with index
s5 = pd.Series(np.arange(3, 8), index=["pens", "pencils", "markers", "erasers", "notebook"])
s5
Output:
pens        3
pencils     4
markers     5
erasers     6
notebook    7
dtype: int32

6. Create a series from dictionary – without index

Example – 1:

This example uses a dictionary with one element per label. Compare the “basketball” label in this example with the one in Example 2.

The order of the output series follows the insertion order of the dictionary, so it matches the order of the labels in dict_enrollment. I am using Python 3.7.6 and pandas 1.0.1. If you are using Python < 3.6 or pandas < 0.23, you will get the output in lexical order of the dict labels instead.

dict_enrollment = {"basketball": 21, "soccer": 15, "chess": 10, "art": 13}
s6 = pd.Series(dict_enrollment)
s6
Output:
basketball    21
soccer        15
chess         10
art           13
dtype: int64

Example – 2:

This example uses a dictionary in which one value is a list. Note the “basketball” label: because the values are no longer all integers, the dtype of the series becomes object.

dict_enrollment2 = {"basketball": [21, 22], "soccer": 15, "chess": 10, "art": 13}
s6 = pd.Series(dict_enrollment2)
s6
Output:
basketball    [21, 22]
soccer              15
chess               10
art                 13
dtype: object
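A quick way to confirm what is stored under each label is to check the element types. This sketch uses a shortened, illustrative version of the dictionary above.

```python
import pandas as pd

s = pd.Series({"basketball": [21, 22], "soccer": 15})

# Mixed Python objects force the generic object dtype.
print(s.dtype)

# The list is stored as a single element under its label.
print(type(s["basketball"]).__name__)
print(len(s))
```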

7. Create a series from dictionary – with index

# 7. Create a series from dictionary - with index
dict_enrollment = {"basketball": 21, "soccer": 15, "chess": 10, "art": 13}
s7 = pd.Series(dict_enrollment, index=["art", "soccer", "chess", "basketball"])
s7
Output:
art           13
soccer        15
chess         10
basketball    21
dtype: int64

Because index values are specified in the pd.Series() call, the order of the series follows the order of the index, not the order of the dict labels.
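One way to think about this is that the dictionary is turned into a series first and then rearranged to match the requested labels. The sketch below compares the two approaches. The variable names are illustrative.

```python
import pandas as pd

enrollment = {"basketball": 21, "soccer": 15, "chess": 10, "art": 13}
order = ["art", "soccer", "chess", "basketball"]

# Pass the desired order directly through the index parameter.
from_index = pd.Series(enrollment, index=order)

# Build the series in dict order, then reorder it with reindex().
from_reindex = pd.Series(enrollment).reindex(order)

print(list(from_index.index))
print(from_index.equals(from_reindex))
```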

8. Create a series from dictionary – additional values mentioned in index

Take note of the label “tennis” in the index. As no corresponding value is given in the dictionary (dict_enrollment), its value in the series is NaN. In the same way, every label without a value is filled with NaN. Because NaN is a floating-point value, the dtype of the series becomes float64.

# 8. Create a series from dictionary - additional values mentioned in index
dict_enrollment = {"basketball": 21, "soccer": 15, "chess": 10, "art": 13}
s8 = pd.Series(dict_enrollment, index=["art", "soccer", "chess", "basketball", "tennis"])
s8
Output:
art           13.0
soccer        15.0
chess         10.0
basketball    21.0
tennis         NaN
dtype: float64
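If you need to find or replace the missing entries afterwards, isna() and fillna() can help. This is a minimal sketch. Filling with 0 is only an assumption for illustration and may not suit your data.

```python
import pandas as pd

enrollment = {"basketball": 21, "soccer": 15, "chess": 10, "art": 13}
labels = ["art", "soccer", "chess", "basketball", "tennis"]
s = pd.Series(enrollment, index=labels)

# Labels that had no value in the dictionary.
missing = s[s.isna()].index.tolist()

# Replace NaN with 0 (an illustrative choice) and convert back to integers.
filled = s.fillna(0).astype("int64")

print(missing)
print(filled)
```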

9. Create a series with a constant number

The same number is repeated once for every label in the index. Here, the index has five values from 0 to 4, so the number 25 is filled in 5 times when the series is created.

# 9. Create a series with a constant number
s9 = pd.Series(25, index=[0, 1, 2, 3, 4])
s9
Output:
0    25
1    25
2    25
3    25
4    25
dtype: int64
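The same idea works with any index, not only a hand-written list. The sketch below uses range() and string labels as illustrative alternatives.

```python
import pandas as pd

# range(5) produces the labels 0 to 4 without typing them out.
s_range = pd.Series(25, index=range(5))

# A float constant repeated over string labels.
s_labels = pd.Series(0.0, index=["a", "b", "c"])

print(len(s_range), s_range.tolist())
print(s_labels.tolist())
```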

10. Create a series where the index is a numpy array

In all the above examples, I have used a list to specify the index values. Below, I am using a numpy array to specify the index.

# 10. Create a series where the index is a numpy array
s10 = pd.Series([21, 22, 23, 24], index=np.arange(3, 7))
s10
Output:
3    21
4    22
5    23
6    24
dtype: int64
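When the index holds integers that do not start at 0, labels and positions no longer match. As a small sketch, .loc looks up by label and .iloc by position. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([21, 22, 23, 24], index=np.arange(3, 7))

# Label 3 is the first element of the series.
by_label = s.loc[3]

# Position 3 is the fourth (last) element.
by_position = s.iloc[3]

# There is no label 0 in this index.
has_label_0 = 0 in s.index

print(by_label, by_position, has_label_0)
```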

