
ghz 9 months ago ⋅ 88 views

Redis perform calculation with python to aggregate timestamps per day

I have a Redis Stack Docker container running on localhost with millions of medical articles.

I would like to group them per day so that I can calculate the number of articles published each day, and other things later on. Each article has a field 'published' storing a numeric value, which is a Unix timestamp.

What I have done so far:

import redis
from datetime import datetime
from redis.commands.search.query import NumericFilter, Query
from redis.commands.search.aggregation import AggregateRequest
from redis.commands.search.reducers import count
import redis.commands.search.aggregation as aggregations
import redis.commands.search.reducers as reducers

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Trying to aggregate to count the number of articles per day
req = aggregations.AggregateRequest("*").group_by(['@published - (@published % 86400)'], reducers.count().alias("total"))
r.ft("idx:articles").aggregate(req).rows

But I get:

    raise response
redis.exceptions.ResponseError: No such property `@published - (@published % 86400)`

I have properly created my index because I am able to retrieve all the articles between two timestamps for example.

Thank you for your help!

Answers

The error means that Redis Search is treating the whole string @published - (@published % 86400) as the name of a property and cannot find one with that name. GROUPBY accepts property names only, not expressions.

In Redis Search, a property used in an aggregation pipeline is either a field defined in the index schema or a value created earlier in the pipeline with APPLY. Since range queries on published already work, that field is most likely indexed correctly; what is missing is a step that computes the day and gives it a name.

To group articles per day, make sure published is defined as a NUMERIC field in the index, compute the start of the day with APPLY, and then group by the resulting property.

Here's how you can do it:

  1. Define Index Schema: Make sure that your Redis Search index schema includes published as a NUMERIC field, which stores the Unix timestamp. Marking it SORTABLE keeps the value readily available to the aggregation pipeline. Do not add NOINDEX, because that would disable the range queries you already run on the field.
# Example index schema definition using redis-py field objects
from redis.commands.search.field import NumericField

schema = [
    NumericField("published", sortable=True),
]
  2. Create Index: If the index does not exist yet, create it with this schema. If it already exists with published as a NUMERIC field, skip this step; creating it again raises an error.
# Create the index (only if idx:articles does not exist yet)
r.ft("idx:articles").create_index(schema)
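  3. Grouping with Aggregate Request: Compute the day with APPLY, which stores the result under a new property name (day here), and then group by that property. If published is not SORTABLE in your index, you may also need to add .load("@published") before .apply(...).
# APPLY names the day bucket; GROUPBY then groups on that named property
req = aggregations.AggregateRequest("*").apply(day="@published - (@published % 86400)").group_by("@day", reducers.count().alias("total"))
result = r.ft("idx:articles").aggregate(req)
print(result.rows)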
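
The rows printed above are easier to read once each day bucket is turned back into a calendar date. The sketch below assumes each row is a flat list of alternating names and string values, that the bucket is exposed under the name day and the count under total, and that days are counted in UTC. The helper name rows_to_daily_counts and the sample rows are made up for illustration and are not part of redis-py.

```python
from datetime import datetime, timezone


def rows_to_daily_counts(rows):
    """Turn flat [name, value, name, value, ...] rows into {ISO date: count}."""
    counts = {}
    for row in rows:
        # Pair up alternating names and values, e.g. ['day', '...', 'total', '...']
        fields = dict(zip(row[0::2], row[1::2]))
        # Values are assumed to arrive as strings; float() tolerates "1.7e+09" style output
        day_start = int(float(fields["day"]))
        date = datetime.fromtimestamp(day_start, tz=timezone.utc).date().isoformat()
        counts[date] = int(fields["total"])
    return counts


# Hypothetical rows, shaped the way the aggregate result is assumed to look
sample_rows = [
    ["day", "1704067200", "total", "3"],
    ["day", "1704153600", "total", "1"],
]
print(rows_to_daily_counts(sample_rows))
```

If your result uses different names for the bucket or the count, adjust the two dictionary keys accordingly.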

Make sure to replace schema with your actual index schema definition and adjust it according to your needs.

Once published is defined as a NUMERIC field and the day is computed with APPLY before GROUPBY, you should be able to group articles per day based on their publication timestamp without encountering the error.
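
To see what the expression @published - (@published % 86400) does, here is the same arithmetic in plain Python, with no Redis involved. It rounds each Unix timestamp down to 00:00:00 UTC of its day, so every article from the same UTC day gets the same key. The function name day_bucket and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def day_bucket(ts):
    """Round a Unix timestamp down to 00:00:00 UTC of the same day."""
    return ts - (ts % SECONDS_PER_DAY)


# Made-up timestamps: two on 2024-01-01 (UTC) and one on 2024-01-02 (UTC)
published = [1704067200, 1704110400, 1704153600 + 60]

# Count how many timestamps fall into each day bucket
per_day = Counter(day_bucket(ts) for ts in published)

for start, total in sorted(per_day.items()):
    print(datetime.fromtimestamp(start, tz=timezone.utc).date(), total)
```

Note that the buckets follow UTC days. If you need days in a local time zone, the timestamps would have to be shifted by the zone offset before bucketing.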