Parameter Servers

This module provides a prototype implementation of a fault-tolerant parameter server built on the reconfigurable ProcessGroups.

class torchft.parameter_server.ParameterServer(port: int, store_port: int = 0)

Bases: ABC

This implements a threaded parameter server using the torchft reconfigurable ProcessGroups.

address() → str

Returns the HTTP address to create a new session on this server.

Format: http://host:port/new_session

Returns

an HTTP address
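As a rough illustration of the address format above, the sketch below splits such a URL into its host and port using only the standard library. The helper name `parse_session_address` and the example host and port are hypothetical and are not part of torchft; the only thing taken from the documentation is the `http://host:port/new_session` shape.

```python
from urllib.parse import urlsplit


def parse_session_address(address: str) -> tuple[str, int]:
    """Split an address of the form http://host:port/new_session.

    Hypothetical helper for illustration only; torchft does not provide it.
    """
    parts = urlsplit(address)
    if parts.scheme != "http" or parts.path != "/new_session":
        raise ValueError(f"unexpected session address: {address!r}")
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"address is missing a host or port: {address!r}")
    return parts.hostname, parts.port


# Hypothetical address; a real one would come from ParameterServer.address().
host, port = parse_session_address("http://ps-host.example:29600/new_session")
```

A check like this can be handy when logging or validating the address before handing it to a client, but it is optional: the documented API accepts the address string as-is.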

abstract forward(session_id: str, pg: ProcessGroup) → None

This method is called once per session, in a dedicated thread. To support multiple operations on a single session, put a for-loop in your forward implementation.

If an error occurs, the process group will be freed and the client will have to create a new session.

The server rank is 0 and the client rank is 1.

Must be implemented by subclasses.

Parameters
  • session_id – a unique uuid for this session

  • pg – the ProcessGroup that’s configured for the client.
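The "one call per session, loop inside forward" pattern described above can be sketched without torchft at all. Everything in the block below is a stand-in: `StandInGroup` is a hypothetical two-queue channel that replaces a configured ProcessGroup, `SketchServer` is a local mimic of the documented interface, and the `None` shutdown signal is an assumption made for the example. It shows only the threading and looping shape, not real collective operations.

```python
import queue
import threading
import uuid
from abc import ABC, abstractmethod


class StandInGroup:
    """Hypothetical stand-in for a configured ProcessGroup (not a torchft class).

    It models a two-party channel: the server side plays rank 0,
    the client side plays rank 1.
    """

    def __init__(self) -> None:
        self.to_server: queue.Queue = queue.Queue()
        self.to_client: queue.Queue = queue.Queue()


class SketchServer(ABC):
    """Local mimic of the documented interface; not torchft's ParameterServer."""

    @abstractmethod
    def forward(self, session_id: str, pg: StandInGroup) -> None: ...

    def start_session(self, pg: StandInGroup) -> threading.Thread:
        # One dedicated thread per session; forward is invoked exactly once.
        session_id = str(uuid.uuid4())
        thread = threading.Thread(
            target=self.forward, args=(session_id, pg), daemon=True
        )
        thread.start()
        return thread


class DoublingServer(SketchServer):
    def forward(self, session_id: str, pg: StandInGroup) -> None:
        # The for-loop lets a single session serve many requests.
        # iter(callable, sentinel) keeps calling get() until it returns None,
        # which this sketch treats as a (hypothetical) shutdown signal.
        for value in iter(pg.to_server.get, None):
            pg.to_client.put(value * 2)


pg = StandInGroup()
thread = DoublingServer().start_session(pg)

replies = []
for x in (1, 2, 3):
    pg.to_server.put(x)
    replies.append(pg.to_client.get(timeout=5))

pg.to_server.put(None)  # ask the session loop to exit
thread.join(timeout=5)
```

In a real subclass the loop body would issue operations on `pg` instead of reading from queues, and an exception raised inside the loop would end the session, after which the client needs a new one, as described above.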

abstract classmethod new_process_group() → ProcessGroup

Create a new non-configured ProcessGroup for the ParameterServer to configure when setting up server and client connections.

Must be implemented by subclasses.

Returns

a new ProcessGroup

classmethod new_session(address: str) → ProcessGroup

Creates a new session on the parameter server and returns a ProcessGroup configured for that server.

Client is rank 1, server is rank 0.