Questions tagged [pandas]

Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data-science libraries in Python.

0
votes
0answers
5 views

In a pandas DataFrame with readings for multiple systems, how can I calculate daily averages and select the most recent average for each system

I have imported a data set into a pandas DataFrame. Each row is one reading (amplitude) from a specific system (id) at a specific time stamp (time_stamp). There are multiple readings from each system. I ...
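A minimal sketch of one way to approach this, assuming the columns are named id, time_stamp and amplitude as the excerpt describes. The sample values are invented for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
readings = pd.DataFrame({
    "id": ["A", "A", "A", "B", "B"],
    "time_stamp": pd.to_datetime([
        "2019-04-01 08:00", "2019-04-01 20:00", "2019-04-02 09:00",
        "2019-04-01 10:00", "2019-04-01 12:00",
    ]),
    "amplitude": [1.0, 3.0, 5.0, 10.0, 20.0],
})

# Daily average per system: truncate each time stamp to its calendar day,
# then take the mean amplitude per (id, day) pair.
daily = (
    readings
    .assign(day=readings["time_stamp"].dt.floor("D"))
    .groupby(["id", "day"], as_index=False)["amplitude"]
    .mean()
)

# Most recent daily average for each system: sort by day, keep the last row per id.
latest = daily.sort_values("day").groupby("id").tail(1).set_index("id")
print(latest)
```

Here system A ends up with the average for its latest day (2019-04-02) and system B with the average for 2019-04-01, its only day.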
0
votes
0answers
4 views

Why does Pandas/Numpy automatically round up 9999999999 to 1.000000e+10?

I have a Pandas dataframe, with 4 rows, and one of the columns (named limit) contains floating point values, where any zeros must be replaced with 9999999999 (9.999999999 billion). The column is set ...
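A hedged illustration, assuming the column is float64 and the "rounding" is only seen when the frame is printed. In that case the stored value is unchanged and only the display format differs. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data; only the column name "limit" comes from the question.
df = pd.DataFrame({"limit": [0.0, 2500.5, 0.0, 100.0]})
df["limit"] = df["limit"].replace(0, 9999999999)

# float64 represents 9999999999 exactly, so the comparison holds.
stored_exactly = df["limit"].iloc[0] == 9999999999

# Ask for fixed-point rendering instead of the default float formatting.
text = df.to_string(float_format="{:.1f}".format)
print(stored_exactly)
print(text)
```

If the values must stay whole numbers, converting the column to an integer dtype is another option, provided it contains no missing values.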
0
votes
0answers
4 views

Apply a common user defined function on two dataframes

I have written code in pandas which is fetching 2 dataframes df1 and df2. I have a big function defined by me like def analysis(df): df['Column1'] = df['Column1'].astype(str) --------- ...
-2
votes
0answers
7 views

Is there a Python library to group values of a column based on time interval when timestamps are specified in another column?

For each unique time stamp in TimeStamp column, I want all the events occurring 2.5 mins( 2 mins 30 secs) before and 2.5 mins after the particular TimeStamp (Total 5 mins interval) to be grouped and ...
0
votes
0answers
16 views

How to fix this “['cesium'] not in index” [duplicate]

I am doing an assignment revolving around the isotope Caesium-137 and I do not know how to fix this index error. I have not tried anything because I really have no background knowledge about Python. ...
0
votes
0answers
6 views

Is there any better way to use string method of pandas to search for multiple strings without regular expression

Wondering if there is any better solution to search for a string satisfying multiple expressions at the same time without using regular expression in pandas. Already seen other posted answers, most ...
1
vote
1answer
8 views

Groupby and remove with condition in pandas

Hello I have a dataframe such as : Group_name Event colomn1 colomn2 colomn3 colomn4 Group1 1 1 1 1 0 Group1 2 2 2 4 2 Group1 3 2 2 4 2 Group2 1 8 8 8 0 Group3 ...
0
votes
0answers
10 views

Discrepancies when creating weights

I have quite a large dataset (test_all) of workplace level data by country and size of the workplace. I am creating weights calculating the conditional frequency of the workplaces based on two ...
2
votes
2answers
9 views

Aggregating list items of duplicate rows in a Pandas DataFrame

Given a dataframe with a key column and a list column: Key List 0 K1 [A, B] 1 K1 [C] I want to aggregate the lists of rows where Key is the same, i.e.: Key List 0 K1 [A, B, C]...
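One possible sketch, using the Key and List column names from the excerpt. The extra K2 row is added here only to show that other keys are left alone.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    "Key": ["K1", "K1", "K2"],
    "List": [["A", "B"], ["C"], ["D"]],
})

# For each key, flatten that key's lists into a single list.
merged = (
    df.groupby("Key")["List"]
    .apply(lambda lists: list(chain.from_iterable(lists)))
    .reset_index()
)
print(merged)
```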
0
votes
0answers
11 views

Create and assign recursively to dataframes in Pandas

I want to read csv files from a directory and assign each to a different dataframe. I have tried to do so like this: path = r'C:\Users\A\Documents\Dash' files = glob.glob(path + "/*.csv") for file in ...
1
vote
2answers
16 views

In Python (Pandas), How to generate a crosstab of categorical values like this?

I have a Pandas dataframe like this device_id content a X a Z b Y c X c Y d Z e Z e ...
0
votes
1answer
14 views

How to speed up calculating string similarity scores within a dataframe?

I have a dataframe as follows: df = pd.DataFrame(data=[[1, 'Berlin',], [2, 'Paris', ], [3, 'Lausanne', ], [4, 'Bayswater',], [5, 'Table Bay', ], [6, 'Bejing',],...
0
votes
1answer
13 views

Transform coordinates Seaborn/Matplotlib

How can I transform coordinates of the plot, where 0,0 is the bottom left, and 1,1 is the top right, to the coordinate of the X and Y values shown on the graph below? What I want to achieve is to ...
0
votes
0answers
15 views

Pandas function to insert rows into a table

I am trying to insert records into a Netezza table by reading a CSV file into a pandas dataframe, but keep getting a key error. KeyError: ('columnname', 'totalCount', 'distinctValuesCount') Am I ...
0
votes
1answer
16 views

Equally split value when upsampling data

Using the pandas library in Python, I'm trying to upsample some data from monthly to daily values. Is there a way to evenly split a value over its resulting 'children'? As an example, let's say I ...
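A sketch of one way to split each monthly value evenly over its days. It assumes the monthly data is indexed by month periods; the values are invented for illustration.

```python
import pandas as pd

# Hypothetical monthly totals indexed by month.
monthly = pd.Series(
    [310.0, 280.0],
    index=pd.period_range("2019-01", periods=2, freq="M"),
)

parts = []
for period, value in monthly.items():
    # All calendar days that belong to this month.
    days = pd.date_range(period.start_time, period.end_time.normalize(), freq="D")
    # Each day receives an equal share of the monthly value.
    parts.append(pd.Series(value / len(days), index=days))

daily = pd.concat(parts)
print(daily.head())
```

Dividing by the number of days in each month keeps the total unchanged, so the daily values sum back to the original monthly figures.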
1
vote
2answers
20 views

How to change order columns and combine them together in a single pandas dataframe?

I got two pandas data frames below. How can I change the order of the columns and combine them? Thanks! +-----+---------------+-----------------+ | Id | Name | Title | +-----+-----...
1
vote
3answers
24 views

matplotlib: assigning different hatch to bars

I have a dataframe where for each index, I have to plot two bars (for two series). The following code gives the output as: import pandas as pd import numpy as np import matplotlib.pyplot as plt df = ...
2
votes
1answer
33 views

Sum of Values with Specific Condition

I would like to ask a question about pandas/python. Let us say I have two columns. I want to find the cumulative sum of values of my first column until the value of my second column reaches a specific ...
0
votes
1answer
9 views

Extracting nested XML elements of different sizes into Pandas

Let's assume we have an arbitrary XML document like below <?xml version="1.0" encoding="UTF-8"?> <programs xmlns="http://something.org/schema/s/program"> <program xmlns:xsd="http://www....
0
votes
0answers
10 views

Python number of rows by group and assign into a new column [duplicate]

I'm doing some data transformation using Python. I'm hoping to get the number of rows per group (for the idea, refer to count number rows with groupby pandas & How to group and count rows by month and ...
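A minimal sketch of attaching a per-group row count as a new column. The column names group, value and n_rows are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"group": ["a", "a", "b"], "value": [1, 2, 3]})

# transform returns a result aligned with the original rows,
# so every row receives the size of its own group.
df["n_rows"] = df.groupby("group")["group"].transform("size")
print(df)
```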
0
votes
1answer
14 views

Different behaviour of numpy sum min max functions when aggregating or when applied to list or array

I see different behavior when applying the same numpy function as an aggregation function of groupby or to the same list of values, when nan values are involved. This applies to np.sum np.min np.max ...
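A small illustration of the kind of difference described, under the assumption that the discrepancy comes from NaN handling. NumPy's plain reductions propagate NaN, while pandas' groupby reductions skip NaN by default. The data is hypothetical.

```python
import numpy as np
import pandas as pd

values = [1.0, np.nan, 3.0]

# Plain NumPy reduction on a list: NaN propagates into the result.
plain_sum = np.sum(values)

# pandas groupby reduction: NaN is skipped by default.
df = pd.DataFrame({"g": ["x", "x", "x"], "v": values})
grouped_sum = df.groupby("g")["v"].sum().loc["x"]

# NumPy's NaN-aware variant matches the pandas result.
nan_aware_sum = np.nansum(values)
print(plain_sum, grouped_sum, nan_aware_sum)
```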
1
vote
2answers
14 views

count the number of unique combinations in pandas data frame

I am having trouble (brain block) producing some simple summary statistics for my data. What I would like to do is to count the number of co-occurring "code" values across all "id"s. The data looks ...
-1
votes
1answer
15 views

How to use pd.get_dummies for all the columns of one data frame?

I want to do dummy variable encoding on all the columns of my data frame together, i.e. without specifying the column names. I have tried the following code but failed. It either throws a Memory ...
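A sketch of encoding every column without typing the names by hand, by passing the frame's own column list to pd.get_dummies. The data is hypothetical. For wide, high-cardinality data the result can be very large, which may be related to the memory problem mentioned in the excerpt.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"color": ["red", "blue"], "size": [1, 2]})

# Passing all column names forces numeric columns to be encoded too;
# by default only object, string and category columns are converted.
encoded = pd.get_dummies(df, columns=list(df.columns))
print(encoded)
```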
0
votes
2answers
25 views

How can I append a pandas dataframe to an Oracle table?

I want to append a pandas dataframe to my table in Oracle, but this code deletes all rows in the table :( My dataframe and my result become this: 0, 0, 0, ML_TEST, 0, 5 // 0, 0, 0, ML_TEST, 0, 6 ...
0
votes
2answers
30 views

Calculate percentile in pandas

I have a data set named join2 like this pd.DataFrame({'id' : [197, 220, 278, 300, 303, 318, 326, 339, 354, 382, 407, 432, 433, 440, 441, 447, 454, 501, 504, 508, 550, 564,601, 602, 606,628,643, ...
0
votes
0answers
9 views

Python Pandas Dataframe - Boolean indexing with function over whole column

I'm currently working on a pricing tool and I'm currently improving the quality of my code to make it run faster. I have created a function that replaces in a string all the words with '@' before by ...
-2
votes
2answers
23 views

How to remove the special characters in the columns and convert the columns into float

I would like to remove the first and last characters in the column and convert the column into float. The column type is object. my column data like this : train['longtitude'].head() 0 ...
0
votes
1answer
18 views

Changed the x axis spine color based on two numpy arrays

My code looks like this: price = np.array(df.price) low_price = np.array(df['mid price']) price is equal to: [nan,nan, 2, 3, nan,nan, 1] low_price is equal to: [1, 2, 2, 3, 3, 4, 1] When I plot ...
-2
votes
2answers
26 views

some code to combine txt files with removing words in the beginning

Good day to everyone! The thing is I have some txt files and I have a script to put them together. Every txt file starts with: Export Type: by LAI\GCI\SAI LAI\GCI\SAI: ...
0
votes
2answers
26 views

Apply function to multiple columns, when column headers are in list

I have a function below, which is designed to work on one column. I have a list of column headers in dataframe df. cols=[col1,col2,col3] def retention(adstock_rate, df,variablename): ...
4
votes
3answers
68 views

Why is there such a speed difference between these 2 variants?

Version 1 import string, pandas as pd def correct_contraction1(x, dic): for word in dic.keys(): if word in x: x = x.replace(word, " " + dic[word]+ " ") return x Version ...
1
vote
1answer
20 views

How to find Exact String match using Lambda function while comparing frozen sets?

While following this answer by @adrtam. I tried to find exact match for line using A) print(rules[rules["antecedents"].apply(lambda x: 'line' in x)]) and B) print(rules[rules["antecedents"].apply(...
3
votes
2answers
27 views

Dataframe groupby - list of values

I have the following dataframe: driver_id status dttm 9f8f9bf3ee8f4874873288c246bd2d05 free 2018-02-04 00:19 9f8f9bf3ee8f4874873288c246bd2d05 busy 2018-02-04 01:...
-3
votes
0answers
24 views

Delete first row (row) and set the decimal degree for the cells

What script should I add to remove the headers (first row) from output excel files? also, is it possible to set the decimal degree for the cells in the output files? Thank you. My code is : import ...
1
vote
1answer
22 views

Generate lists of all columns in pandas dataframe

My dataframe has 40+ columns. I would like to generate lists with each list containing values from one column. Here is how I tried to do it cols= df.columns cols = cols.tolist() for col in cols: ...
0
votes
1answer
17 views

Pandas groupby, sort values and take top 3 with unique rank in Python?

I have a DataFrame like this Val1 Val2 0 a 1.0 1 a 1.0 2 a 0.98 3 a 0.78 4 a 0.70 5 b 0....
1
vote
1answer
12 views

Checking if two values in two different columns of a df exist in a different df?

I have a dfa and a dfb that both look like the below: id start_time ab23 2019-04-01 23:00:00.000 bv63 2019-04-01 23:15:00.000 ab20 2019-04-01 21:00:00.000 bv43 2019-04-01 22:...
0
votes
0answers
14 views

an error with dimensions in CNN using tensorflow

I am doing a basic tensorflow driven CNN. There is some kind of dimension error which I am unable to locate. Thanks in advance. I am working with Jupyter on my system. I run in a miniconda environment. pred =...
3
votes
0answers
28 views

Pandas assignment vs inplace=True on .loc? [duplicate]

After reading the Pandas docs and some articles my understanding was that using .loc instead of chained indexing will give a view instead of a copy... so in the code below why does inplace=True fail ...
1
vote
1answer
47 views

Creating a python list from data frame based on conditions

I am trying to generate a list from a pandas data frame based on certain conditions on column values in data frame, my df looks something like df = 48 150 39 0 ...
-1
votes
1answer
10 views

Python Solution to Pivot and Aggregate Data Frame [duplicate]

I have a two column pandas dataframe, first column has numerical codes and second has discrete values. I'm trying to group by the numerical code and pivot the discrete values into new columns while ...
-3
votes
4answers
32 views

Writing Multiple Python Lists to CSV

I am having trouble writing Python lists to CSV... Here is the code; the script returns several lists, and I need all of the lists to be written to the CSV. import datetime as dt import pandas as pd import csv ...
0
votes
1answer
24 views

Why do we need to use a lambda function in the last example in the API docs of “pandas.DataFrame.iloc”?

We can see the last example here: >>> df.iloc[:, lambda df: [0, 2]] a c 0 1 3 1 100 300 2 1000 3000 I tried it and think we can drop "lambda df:". I ...
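A small sketch comparing the two forms. The sample frame is constructed here so that it reproduces the a and c values shown in the excerpt; columns b and d are assumptions. For a fixed list of positions the callable adds nothing. It mainly helps in method chains, where the intermediate frame has no name to refer to.

```python
import pandas as pd

df = pd.DataFrame([
    {"a": 1, "b": 2, "c": 3, "d": 4},
    {"a": 100, "b": 200, "c": 300, "d": 400},
    {"a": 1000, "b": 2000, "c": 3000, "d": 4000},
])

# With a constant list of positions, both forms select the same columns.
with_callable = df.iloc[:, lambda frame: [0, 2]]
with_list = df.iloc[:, [0, 2]]

# In a chain, the callable receives the filtered frame, whose length
# is not known in advance; here it picks that frame's last row.
last_of_filtered = df[df["a"] > 1].iloc[lambda frame: [len(frame) - 1]]
print(with_callable.equals(with_list))
print(last_of_filtered)
```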
-1
votes
0answers
29 views

How to convert an Excel file to a dict with Python pandas [duplicate]

For example, this is an Excel sheet: name age a 16 b 17 c 18 I want to get output like { 'name': ['a', 'b', 'c'], 'age': [16, 17, 18] } Is there any good way to do it?
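A minimal sketch of producing that shape. The frame is built in memory so the example runs without a file; in practice it would presumably come from pd.read_excel with the actual workbook path.

```python
import pandas as pd

# Stand-in for the sheet in the question. With a real file this would be
# something like pd.read_excel("people.xlsx"), where the file name is hypothetical.
df = pd.DataFrame({"name": ["a", "b", "c"], "age": [16, 17, 18]})

# orient="list" maps each column name to the list of that column's values.
result = df.to_dict(orient="list")
print(result)
```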
0
votes
0answers
36 views

Why is the fillna call invalid syntax here? [on hold]

I ran the following code in Jupyter, but it shows SyntaxError: invalid syntax at "year_genr.fillna(value=0,inplace=True')". So where is the syntax wrong? year_genr = pd.DataFrame(index=...
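The quoted call has a stray apostrophe after True, which opens a string literal that is never closed and would explain the syntax error. A sketch of the call without it follows; the contents of year_genr are hypothetical, since the excerpt truncates its construction.

```python
import pandas as pd

# Hypothetical frame; the original definition is cut off in the excerpt.
year_genr = pd.DataFrame({"Action": [1.0, None]}, index=[2000, 2001])

# Same call as quoted, without the stray apostrophe after True.
# Assigning the result is used here instead of inplace=True.
year_genr = year_genr.fillna(value=0)
print(year_genr)
```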
1
vote
0answers
18 views

Pandas 0.24.0 breaks my pandas dataframe with special column identifiers

I had code that worked fine until I tried to run it on a coworker's machine, whereupon I discovered that while it worked using pandas 0.22.0, it broke on pandas 0.24.0. For the moment, we've solved ...
-1
votes
0answers
23 views

How to use pandas to organize sales data into 12 months and find the 10 most profitable products for those 12 months?

I need to write a program that organizes the data in the provided spreadsheet to find the top 10 most profitable products by each month. The program needs to take an input from the user to specify the ...
0
votes
0answers
17 views

Passing in the dtypes for many columns at once when creating a dataframe through csvs

I have a very large file with many columns (think 50+) and need to pass in the dtypes for all of them as strings to avoid mixed types. How can I do this efficiently without having to specify the ...
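One possible approach, sketched on an in-memory CSV with invented contents: passing a single type as dtype applies it to every column, so no per-column mapping is needed.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the large file.
csv_text = "id,code,amount\n001,0A,3.50\n002,1B,4\n"

# dtype=str reads every column as text, so leading zeros and
# trailing decimals are preserved exactly as written.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(df)
```

If only some columns need to be strings, a dict such as dtype={"id": str} can be passed instead.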
1
vote
0answers
14 views

“not all arguments converted during string formatting” when using data.to_sql

I'm trying the following piece of code, which I found in a 2016 book: import MySQLdb import pandas as pd # database setup omitted for the sake of brevity nr_customers = 100 colnames = ["movie%i" %i ...
0
votes
2answers
29 views

How to transform string into binary records?

I have this data set here: df = pd.read_csv('c:/1/Autism_Data.arff',na_values="?") I need to transform the columns "gender", "jundice", "austim" into binary 0-1 records. I would like to see this table like ...
