pd.concat joins datasets together, and pd.merge joins them on a common column.
My question: when working with large datasets, would it be safer to merge on 'date' rather than join two datasets via concat, given that they might not align on dates?
Just my paranoia kicking in. Thanks.
Hi Jammy,
Since you want to combine the datasets based on the "date" column, yes, the merge method would be better. Concat only stitches the datasets one below the other (if axis=0) or next to each other (if axis=1), and it lines rows up by index, not by the values in the "date" column.
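To make the difference concrete, here is a small sketch. The two DataFrames (`prices` and `volumes`) and their values are made up for illustration; only `pd.concat` and `pd.merge` come from your question.

```python
import pandas as pd

# Hypothetical toy data: the two frames cover overlapping but different dates.
prices = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"]),
    "price": [10.0, 11.0, 12.0],
})
volumes = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-02", "2021-01-03", "2021-01-04"]),
    "volume": [200, 300, 400],
})

# concat with axis=1 aligns on the index (0, 1, 2), not on "date",
# so rows with different dates end up side by side.
side_by_side = pd.concat([prices, volumes], axis=1)

# merge matches rows whose "date" values are equal; the default
# how="inner" keeps only the dates present in both frames.
merged = pd.merge(prices, volumes, on="date")

# how="outer" keeps every date and fills the gaps with NaN;
# indicator=True adds a "_merge" column showing where each row came from.
outer = pd.merge(prices, volumes, on="date", how="outer", indicator=True)

print(side_by_side)
print(merged)
print(outer)
```

If you are worried about silently losing rows on a large dataset, the outer merge with `indicator=True` is a handy way to check which dates exist in only one of the two frames before you settle on the final join.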
The two methods have different use cases. If you need more details about when to use each one, you can check that out here.
Hope this is helpful!
Thanks,
Rushda Ansari