Given a list of dates, group the dates into successive ranges of K days, starting from the smallest date in the list. Each group holds the dates that fall within one K-day range.
Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 10
Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0)]), (1, [datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]
Explanation : 27 Dec to 4 Jan are in the same group because each of these dates is less than 10 days after the smallest date (27 Dec). Each later group covers the next 10-day range.
Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 14
Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0)]), (1, [datetime.datetime(2020, 1, 10, 0, 0), datetime.datetime(2020, 1, 20, 0, 0)])]
Explanation : 27 Dec to 7 Jan are in the same group because each of these dates is less than 14 days after the smallest date (27 Dec). Each later group covers the next 14-day range.
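To make the grouping rule behind the two examples concrete, the short sketch below computes each date's offset in days from the smallest date and the resulting group index (the offset integer-divided by K). The helper name `bucket_indices` is an illustrative assumption and is not part of the original listings.

```python
from datetime import datetime

def bucket_indices(dates, K):
    """Return (day offset from the earliest date, group index) per date, in sorted order."""
    start = min(dates)
    return [((d - start).days, (d - start).days // K) for d in sorted(dates)]

test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
             datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]

print(bucket_indices(test_list, 10))
# [(0, 0), (3, 0), (8, 0), (11, 1), (14, 1), (24, 2)]
print(bucket_indices(test_list, 14))
# [(0, 0), (3, 0), (8, 0), (11, 0), (14, 1), (24, 1)]
```

The second element of each pair matches the group numbers shown in the two example outputs.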
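Method #1: Using groupby() + sort()
In this method, we sort the dates and then group them with groupby(), using a key function that maps each date to the index of its range: the number of days since the smallest date, integer-divided by K.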
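The sort step matters because `itertools.groupby` only merges adjacent items that share a key. The sketch below illustrates this with plain day offsets instead of dates; the values and the variable names are made up for the illustration.

```python
from itertools import groupby

K = 7
offsets = [8, 3, 11, 0]  # hypothetical day offsets, deliberately unsorted

# Unsorted input: equal keys that are not adjacent land in separate groups.
unsorted_groups = [(k, list(g)) for k, g in groupby(offsets, key=lambda d: d // K)]
print(unsorted_groups)  # [(1, [8]), (0, [3]), (1, [11]), (0, [0])]

# Sorted input: each key appears in exactly one group.
sorted_groups = [(k, list(g)) for k, g in groupby(sorted(offsets), key=lambda d: d // K)]
print(sorted_groups)    # [(0, [0, 3]), (1, [8, 11])]
```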
Python3
from itertools import groupby
from datetime import datetime

# initializing list
test_list = [datetime(2020, 1, 4),
             datetime(2019, 12, 30),
             datetime(2020, 1, 7),
             datetime(2019, 12, 27),
             datetime(2020, 1, 20),
             datetime(2020, 1, 10)]

# printing original list
print("The original list is : " + str(test_list))

# length of each range, in days
K = 7

# smallest date, used as the origin of all ranges
min_date = min(test_list)

def group_util(date):
    # index of the K-day range that the date falls into
    return (date - min_date).days // K

# groupby() only merges adjacent items, so sort first
test_list.sort()

temp = []
for key, val in groupby(test_list, key=group_util):
    temp.append((key, list(val)))

# format each date as a string for display
res = []
for sub in temp:
    intr = []
    for ele in sub[1]:
        intr.append(ele.strftime("%Y/%m/%d"))
    res.append((sub[0], intr))

print("Grouped dates : " + str(res))

Output:
The original list is : [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2020, 1, 20, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]
Grouped dates : [(0, ['2019/12/27', '2019/12/30']), (1, ['2020/01/04', '2020/01/07']), (2, ['2020/01/10']), (3, ['2020/01/20'])]
Method #2: Using sort and iteration
Approach
1. Sort the list of dates in ascending order.
2. Initialize a dictionary that maps each group number to the list of dates in that group.
3. Initialize variables to keep track of the current group number and the start date of the current group.
4. Iterate through the sorted list of dates, comparing the current date with the start date of the current group.
5. If the difference between the current date and the start date is less than or equal to K days, add the current date to the current group.
6. If the difference between the current date and the start date is greater than K days, create a new group with the current date as the start date and add the current date to the new group.
7. Convert the groups to a list of (group number, dates) tuples and return it.
Algorithm
1. Sort the given list of dates in ascending order.
2. Initialize an empty dictionary to store the groups of dates.
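3. For each date in the sorted list, calculate the number of days since the start date of the current group by subtracting the two datetime objects and reading the days attribute of the resulting timedelta.
4. If the number of days is greater than K, increment the group number and make this date the new start date. In either case, append the date to the current group.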
5. Convert the dictionary into a list of tuples and return the result.
Python3
from collections import defaultdict
from datetime import datetime

def group_dates(dates, K):
    # maps group number -> list of dates in that group
    groups = defaultdict(list)
    dates.sort()
    group_num = 0
    start_date = None
    for date in dates:
        if start_date is None:
            # the first (smallest) date opens group 0
            start_date = date
        else:
            diff = (date - start_date).days
            if diff > K:
                # more than K days past the group's start: open a new group
                group_num += 1
                start_date = date
        groups[group_num].append(date)
    return list(groups.items())

dates = [datetime(2020, 1, 4),
         datetime(2019, 12, 30),
         datetime(2020, 1, 7),
         datetime(2019, 12, 27),
         datetime(2020, 1, 20),
         datetime(2020, 1, 10)]
K = 7
print(group_dates(dates, K))
Output
[(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), (1, [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]
Time complexity: O(n log n) – sorting the list of dates takes O(n log n) time, where n is the number of dates. The loop that iterates through the sorted list of dates takes O(n) time.
Auxiliary Space: O(n) – we store the groups of dates in a dictionary that can potentially contain n elements.
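Note that this method anchors each range at the first date of its group and keeps a date that is exactly K days after that anchor, so it does not always produce the same groups as the fixed ranges of the groupby() method. The sketch below compares the two rules on the sample data with K = 7. The function names `fixed_buckets` and `anchored_windows` are illustrative assumptions, not names from the original listings.

```python
from datetime import datetime
from itertools import groupby

def fixed_buckets(dates, K):
    """groupby() rule: range index = days since the earliest date, floor-divided by K."""
    ordered = sorted(dates)
    if not ordered:
        return []
    start = ordered[0]
    return [list(g) for _, g in groupby(ordered, key=lambda d: (d - start).days // K)]

def anchored_windows(dates, K):
    """Iterative rule: open a new group once a date is more than K days after the group's first date."""
    groups = []
    for d in sorted(dates):
        if groups and (d - groups[-1][0]).days <= K:
            groups[-1].append(d)
        else:
            groups.append([d])
    return groups

dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
K = 7

print([len(g) for g in fixed_buckets(dates, K)])     # [2, 2, 1, 1]
print([len(g) for g in anchored_windows(dates, K)])  # [2, 3, 1]
```

With fixed ranges, 10 Jan (14 days after 27 Dec) falls into its own range, whereas the anchored rule keeps it with 4 Jan and 7 Jan because it is only 6 days after the group's first date.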
Method #3: Using a while loop to iterate over the dates and create groups based on the K value
Approach:
- Sort the dates in ascending order.
- Initialize an empty list called "groups".
- Set a variable called "current_group" to 0.
- Set a variable called "group_start_date" to the first date in the sorted list.
- Set a variable called "group_end_date" to None.
- While there are still dates left in the list:
  - Get the next date in the list.
  - If the difference between the current date and the group start date is less than or equal to K:
    - Set the group end date to the current date.
  - Else:
    - Append the current group (i.e., the dates between the group start date and the group end date) to the "groups" list.
    - Set the group start date to the current date.
    - Set the group end date to None.
    - Increment the current group number.
- Append the final group to the "groups" list.
- Return the "groups" list.
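The steps above map fairly directly onto an index-based while loop. The following is a minimal sketch under a few assumptions: the function name `group_dates_while` is hypothetical, the members of the current group are collected in a list rather than tracked through an explicit end date, and the input list is left unmodified.

```python
from datetime import datetime

def group_dates_while(dates, K):
    """Group dates into ranges of at most K days from each group's first date."""
    ordered = sorted(dates)   # sorted copy; the caller's list is not mutated
    groups = []               # finished groups as (group number, dates) tuples
    if not ordered:
        return groups

    current_group = 0
    group_start_date = ordered[0]
    members = []              # dates collected for the current group
    i = 0
    while i < len(ordered):
        date = ordered[i]
        if (date - group_start_date).days > K:
            # Close the current group and open a new one starting at this date.
            groups.append((current_group, members))
            current_group += 1
            group_start_date = date
            members = []
        members.append(date)
        i += 1

    groups.append((current_group, members))  # append the final group
    return groups

dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
result = group_dates_while(dates, 7)
print([(num, len(members)) for num, members in result])  # [(0, 2), (1, 3), (2, 1)]
```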
Python3
from collections import defaultdict
from datetime import datetime

def group_dates(dates, K):
    # maps group number (stored as a string key) -> list of dates
    groups = defaultdict(list)
    dates.sort()
    group_num = 0
    start_date = None
    for date in dates:
        if start_date is None:
            # the first (smallest) date opens group '0'
            start_date = date
        else:
            diff = (date - start_date).days
            if diff > K:
                # more than K days past the group's start: open a new group
                group_num += 1
                start_date = date
        groups[str(group_num)].append(date)
    # show the intermediate dictionary before converting it
    print(groups)
    return list(groups.items())

dates = [datetime(2020, 1, 4),
         datetime(2019, 12, 30),
         datetime(2020, 1, 7),
         datetime(2019, 12, 27),
         datetime(2020, 1, 20),
         datetime(2020, 1, 10)]
K = 7
print(group_dates(dates, K))
Output
defaultdict(<class 'list'>, {'0': [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)], '1': [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)], '2': [datetime.datetime(2020, 1, 20, 0, 0)]})
[('0', [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), ('1', [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), ('2', [datetime.datetime(2020, 1, 20, 0, 0)])]
Time complexity: O(n log n), where n is the number of dates in the input list. Sorting dominates; the grouping pass itself takes O(n) time.
Auxiliary space: O(n), since the groups dictionary stores every date.