Ventu API

class ventu.ventu.Ventu(req_schema, resp_schema, use_msgpack=False, *args, **kwargs)[source]

Ventu: built for deep learning model serving

Parameters:
  • req_schema – request schema defined with pydantic.BaseModel
  • resp_schema – response schema defined with pydantic.BaseModel
  • use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
  • args
  • kwargs

To create a model service, inherit this class and implement:

  • preprocess (optional)
  • postprocess (optional)
  • inference (for standalone HTTP service)
  • batch_inference (when working with batching service)
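The hook order above can be sketched as follows. This is a stand-in that emulates the pipeline with a plain base class so it runs without ventu installed; in real code you would inherit `ventu.Ventu` and pass pydantic schemas to `__init__`. `EchoModel` and `_infer_one` are illustrative names, not part of the ventu API.

```python
from typing import Any, List


class _VentuLike:
    """Stand-in for ventu.Ventu, emulating its hook pipeline so the
    example runs anywhere; the real class also wires up validation,
    the Falcon app, and the batching socket."""

    def preprocess(self, data: Any) -> Any:          # optional hook
        return data

    def postprocess(self, data: Any) -> Any:         # optional hook
        return data

    def inference(self, data: Any) -> Any:           # for standalone HTTP service
        raise NotImplementedError

    def batch_inference(self, batch: List[Any]) -> List[Any]:  # for batching service
        raise NotImplementedError

    def _infer_one(self, raw: Any) -> Any:
        # The order the hooks are applied to a single query:
        # preprocess -> inference -> postprocess
        return self.postprocess(self.inference(self.preprocess(raw)))


class EchoModel(_VentuLike):
    # Hypothetical model: "scores" text by its length.
    def preprocess(self, data):
        return data["text"].strip()

    def inference(self, processed):
        return len(processed)

    def batch_inference(self, batch):
        # one result per preprocessed item
        return [len(item) for item in batch]

    def postprocess(self, result):
        return {"length": result}


model = EchoModel()
print(model._infer_one({"text": "  hello  "}))  # {'length': 5}
```

With the real base class, the same subclass would then serve HTTP via `model.run_http(...)` or join a batching service via `model.run_unix(...)` / `model.run_tcp(...)`.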
app

Falcon application with SpecTree validation

batch_inference(batch)[source]

run batch inference on the preprocessed data

Parameters:batch – a list of data after preprocess
Returns:a list of inference results
health_check(batch=False)[source]

health check for model inference (can also be used to warm up the model)

Parameters:batch (bool) – batch inference or single inference (default)
Return bool:True if passed health check
inference(data)[source]

run inference on the preprocessed data

Parameters:data – data after preprocess
Returns:inference result
postprocess(data)[source]

postprocess the inference result

Parameters:data – data after inference, or one item of the batch_inference results
Returns:as defined in resp_schema
preprocess(data)[source]

preprocess the data

Parameters:data – as defined in req_schema
Returns:the input data of inference, or one item of the input batch of batch_inference
run_http(host=None, port=None)[source]

run the HTTP service

Parameters:
  • host (string) – host address
  • port (int) – service port
run_tcp(host=None, port=None)[source]

run as an inference worker with TCP

Parameters:
  • host (string) – host address
  • port (int) – service port
run_unix(addr=None)[source]

run as an inference worker with Unix domain socket

Parameters:addr (string) – socket file address
sock

socket used for communication with batching service

this is an instance of ventu.protocol.BatchProtocol

Config

See pydantic.BaseSettings

class ventu.config.Config[source]

default config; values can be overridden with environment variables beginning with ventu_

Variables:
  • name – default service name shown in OpenAPI
  • version – default service version shown in OpenAPI
  • host – default host address for the HTTP service
  • port – default port for the HTTP service
  • socket – default socket file to communicate with batching service
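Since Config is a pydantic BaseSettings, each field can be overridden through an environment variable carrying the `ventu_` prefix. The sketch below emulates that lookup with the standard library only; the default values shown are illustrative placeholders, not ventu's actual defaults (those live in `ventu.config.Config`).

```python
import os

# Illustrative defaults mirroring the documented Config fields;
# the real default values are defined in ventu.config.Config.
DEFAULTS = {"name": "ventu", "version": "0.1", "host": "localhost",
            "port": 8000, "socket": "batching.socket"}


def load_config(environ=os.environ):
    """Emulate pydantic BaseSettings: any `ventu_<field>` environment
    variable overrides the corresponding default."""
    config = dict(DEFAULTS)
    for field in config:
        env_value = environ.get(f"ventu_{field}")
        if env_value is not None:
            # coerce the string from the environment to the field's type
            config[field] = type(DEFAULTS[field])(env_value)
    return config


cfg = load_config({"ventu_port": "8080", "ventu_name": "demo"})
print(cfg["port"], cfg["name"])  # 8080 demo
```

In practice this means e.g. `ventu_port=8080 python main.py` reconfigures the HTTP port without touching code.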

Protocol

class ventu.protocol.BatchProtocol(infer, req_schema, resp_schema, use_msgpack)[source]

protocol used to communicate with batching service

Parameters:
  • infer – model infer function (contains preprocess, batch_inference and postprocess)
  • req_schema – request schema defined with pydantic
  • resp_schema – response schema defined with pydantic
  • use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
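The `use_msgpack` flag selects the wire serialization, with JSON as the documented default. A minimal sketch of that choice, with a graceful fallback when the optional msgpack package is absent (`make_serializer` is an illustrative helper, not ventu API):

```python
import json

try:
    import msgpack  # optional dependency; only needed when use_msgpack=True
except ImportError:
    msgpack = None


def make_serializer(use_msgpack: bool):
    """Return a (dumps, loads) pair matching the use_msgpack flag;
    JSON is the default."""
    if use_msgpack:
        if msgpack is None:
            raise RuntimeError("msgpack requested but not installed")
        return msgpack.packb, msgpack.unpackb
    # JSON path: encode to bytes so both serializers produce bytes
    return (lambda obj: json.dumps(obj).encode()), (lambda raw: json.loads(raw))


dumps, loads = make_serializer(use_msgpack=False)
payload = dumps({"text": "hi"})
print(loads(payload))  # {'text': 'hi'}
```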
process(conn)[source]

process batch queries and return the inference results

Parameters:conn – socket connection
run(addr, protocol='unix')[source]

run socket communication

this should be run after the socket file has been created by the batching service

Parameters:
  • protocol (string) – ‘unix’ or ‘tcp’
  • addr – socket file path or (host:str, port:int)
stop()[source]

stop the socket communication
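The addr argument to run takes a different shape per protocol: a socket file path for 'unix', and a (host, port) pair for 'tcp'. The helper below is purely illustrative (it builds, but does not connect, the kind of socket a worker would use); the actual connection logic lives inside BatchProtocol.run.

```python
import socket


def make_client_socket(addr, protocol="unix"):
    """Illustrative helper mirroring BatchProtocol.run's addr convention:
    'unix' expects a socket file path, 'tcp' expects (host, port)."""
    if protocol == "unix":
        # addr is the batching service's socket file path (a string)
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    elif protocol == "tcp":
        host, port = addr  # addr is a (host: str, port: int) pair
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    else:
        raise ValueError(f"unknown protocol: {protocol}")
    return sock


sock = make_client_socket(("localhost", 9999), protocol="tcp")
print(sock.family is socket.AF_INET)  # True
sock.close()
```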

HTTP service

class ventu.service.ServiceStatus[source]

service health status

class ventu.service.StatusEnum[source]

An enumeration of service status values.

ventu.service.create_app(infer, metric_registry, health_check, req_schema, resp_schema, use_msgpack, config)[source]

create falcon application

Parameters:
  • infer – model infer function (contains preprocess, inference, and postprocess)
  • metric_registry – Prometheus metric registry
  • health_check – model health check function (requires examples to be provided in the schema)
  • req_schema – request schema defined with pydantic.BaseModel
  • resp_schema – response schema defined with pydantic.BaseModel
  • use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
  • config – configs ventu.config.Config
Returns:a falcon application