Ventu API

class ventu.ventu.Ventu(req_schema, resp_schema, use_msgpack=False, *args, **kwargs)[source]

Ventu: built for deep learning model serving

Parameters:
  • req_schema – request schema defined with pydantic.BaseModel
  • resp_schema – response schema defined with pydantic.BaseModel
  • use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
  • args
  • kwargs

To create a model service, inherit this class and implement:

  • preprocess (optional)
  • postprocess (optional)
  • inference (for standalone HTTP service)
  • batch_inference (when working with batching service)
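The hook order above can be sketched as follows. This is a stand-in that emulates the pipeline with a plain base class so it runs without ventu installed; in real code you would inherit `ventu.Ventu` and pass pydantic schemas to `__init__`. `EchoModel` and `_infer_one` are illustrative names, not part of the ventu API.

```python
from typing import Any, List


class _VentuLike:
    """Stand-in for ventu.Ventu, emulating its hook pipeline so the
    example runs anywhere; the real class also wires up validation,
    the Falcon app, and the batching socket."""

    def preprocess(self, data: Any) -> Any:          # optional hook
        return data

    def postprocess(self, data: Any) -> Any:         # optional hook
        return data

    def inference(self, data: Any) -> Any:           # for standalone HTTP service
        raise NotImplementedError

    def batch_inference(self, batch: List[Any]) -> List[Any]:  # for batching service
        raise NotImplementedError

    def _infer_one(self, raw: Any) -> Any:
        # The order the hooks are applied to a single query:
        # preprocess -> inference -> postprocess
        return self.postprocess(self.inference(self.preprocess(raw)))


class EchoModel(_VentuLike):
    # Hypothetical model: "scores" text by its length.
    def preprocess(self, data):
        return data["text"].strip()

    def inference(self, processed):
        return len(processed)

    def batch_inference(self, batch):
        # one result per preprocessed item
        return [len(item) for item in batch]

    def postprocess(self, result):
        return {"length": result}


model = EchoModel()
print(model._infer_one({"text": "  hello  "}))  # {'length': 5}
```

With the real base class, the same subclass would then serve HTTP via `model.run_http(...)` or join a batching service via `model.run_unix(...)` / `model.run_tcp(...)`.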
app

Falcon application with SpecTree validation

batch_inference(batch)[source]

run batch inference on the preprocessed data

Parameters:batch – a list of data after preprocess
Returns:a list of inference results
health_check(batch=False)[source]

health check for model inference (can also be used to warm up the model)

Parameters:batch (bool) – batch inference or single inference (default)
Return bool:True if passed health check
inference(data)[source]

run inference on the preprocessed data

Parameters:data – data after preprocess
Returns:inference result
postprocess(data)[source]

postprocess the inference result

Parameters:data – data after inference, or one item of the batch_inference results
Returns:as defined in resp_schema
preprocess(data)[source]

preprocess the data

Parameters:data – as defined in req_schema
Returns:the input data of inference, or one item of the input batch of batch_inference
run_http(host=None, port=None)[source]

run the HTTP service

Parameters:
  • host (string) – host address
  • port (int) – service port
run_tcp(host=None, port=None)[source]

run as an inference worker with TCP

Parameters:
  • host (string) – host address
  • port (int) – service port
run_unix(addr=None)[source]

run as an inference worker with Unix domain socket

Parameters:addr (string) – socket file address
sock

socket used for communication with batching service

this is an instance of ventu.protocol.BatchProtocol

Config

See pydantic.BaseSettings

class ventu.config.Config[source]

default config; values can be overridden with environment variables beginning with ventu_

Variables:
  • name – default service name shown in OpenAPI
  • version – default service version shown in OpenAPI
  • host – default host address for the HTTP service
  • port – default port for the HTTP service
  • socket – default socket file to communicate with batching service
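Since Config is a pydantic BaseSettings, each field can be overridden through an environment variable carrying the `ventu_` prefix. The sketch below emulates that lookup with the standard library only; the default values shown are illustrative placeholders, not ventu's actual defaults (those live in `ventu.config.Config`).

```python
import os

# Illustrative defaults mirroring the documented Config fields;
# the real default values are defined in ventu.config.Config.
DEFAULTS = {"name": "ventu", "version": "0.1", "host": "localhost",
            "port": 8000, "socket": "batching.socket"}


def load_config(environ=os.environ):
    """Emulate pydantic BaseSettings: any `ventu_<field>` environment
    variable overrides the corresponding default."""
    config = dict(DEFAULTS)
    for field in config:
        env_value = environ.get(f"ventu_{field}")
        if env_value is not None:
            # coerce the string from the environment to the field's type
            config[field] = type(DEFAULTS[field])(env_value)
    return config


cfg = load_config({"ventu_port": "8080", "ventu_name": "demo"})
print(cfg["port"], cfg["name"])  # 8080 demo
```

In practice this means e.g. `ventu_port=8080 python main.py` reconfigures the HTTP port without touching code.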

Protocol

class ventu.protocol.BatchProtocol(infer, req_schema, resp_schema, use_msgpack)[source]

protocol used to communicate with batching service

Parameters:
  • infer – model infer function (contains preprocess, batch_inference and postprocess)
  • req_schema – request schema defined with pydantic
  • resp_schema – response schema defined with pydantic
  • use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
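The `use_msgpack` flag selects the wire serialization, with JSON as the documented default. A minimal sketch of that choice, with a graceful fallback when the optional msgpack package is absent (`make_serializer` is an illustrative helper, not ventu API):

```python
import json

try:
    import msgpack  # optional dependency; only needed when use_msgpack=True
except ImportError:
    msgpack = None


def make_serializer(use_msgpack: bool):
    """Return a (dumps, loads) pair matching the use_msgpack flag;
    JSON is the default."""
    if use_msgpack:
        if msgpack is None:
            raise RuntimeError("msgpack requested but not installed")
        return msgpack.packb, msgpack.unpackb
    # JSON path: encode to bytes so both serializers produce bytes
    return (lambda obj: json.dumps(obj).encode()), (lambda raw: json.loads(raw))


dumps, loads = make_serializer(use_msgpack=False)
payload = dumps({"text": "hi"})
print(loads(payload))  # {'text': 'hi'}
```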
process(conn)[source]

process batch queries and return the inference results

Parameters:conn – socket connection
run(addr, protocol='unix')[source]

run socket communication

this should be run after the socket file has been created by the batching service

Parameters:
  • protocol (string) – ‘unix’ or ‘tcp’
  • addr – socket file path or (host:str, port:int)
stop()[source]

stop the socket communication
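The addr argument to run takes a different shape per protocol: a socket file path for 'unix', and a (host, port) pair for 'tcp'. The helper below is purely illustrative (it builds, but does not connect, the kind of socket a worker would use); the actual connection logic lives inside BatchProtocol.run.

```python
import socket


def make_client_socket(addr, protocol="unix"):
    """Illustrative helper mirroring BatchProtocol.run's addr convention:
    'unix' expects a socket file path, 'tcp' expects (host, port)."""
    if protocol == "unix":
        # addr is the batching service's socket file path (a string)
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    elif protocol == "tcp":
        host, port = addr  # addr is a (host: str, port: int) pair
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    else:
        raise ValueError(f"unknown protocol: {protocol}")
    return sock


sock = make_client_socket(("localhost", 9999), protocol="tcp")
print(sock.family is socket.AF_INET)  # True
sock.close()
```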

HTTP service

class ventu.service.ServiceStatus[source]

service health status

class ventu.service.StatusEnum[source]

An enumeration of service status values.

ventu.service.create_app(infer, metric_registry, health_check, req_schema, resp_schema, use_msgpack, config)[source]

create falcon application

Parameters:
  • infer – model infer function (contains preprocess, inference, and postprocess)
  • metric_registry – Prometheus metric registry
  • health_check – model health check function (requires examples to be provided in the schema)
  • req_schema – request schema defined with pydantic.BaseModel
  • resp_schema – response schema defined with pydantic.BaseModel
  • use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
  • config – configs ventu.config.Config
Returns:a falcon application