With any external resource, chances are things are going to go wrong sometimes. Following on from issue #98, there are various situations that might require manual intervention. Shutting down a bundle of connections/channels between two specific endpoints, forcibly terminating an endpoint, or even shutting down a whole transport might be required if connectivity drops in such a way that things get stuck.

Having an API for this would enable us to write tools (command-line, web-based, etc.) to assist with manual administration of a distributed system. The API needs to support the primary use case, where an administrator connects to a running node from an external location and takes actions from there.

There needs to be some entry point to which the external client connects, and of course there is no such concept in `Network.Transport`. I can see two ways to go about this, though there may be other possibilities too.

1. couple the functionality with the node controller
2. provide a service registry as part of `Network.Transport` itself

The point here is that in any running executable, we need some means by which we can connect in order to query for this information. As the node controller already provides this, it seems a sensible choice at first glance. The node controller is initialised with a `Transport` so it can use the `Network.Transport` APIs to handle requested interactions.

So does it make sense to force all the interactions to go through the node controller? Another alternative would be to have a registered service process that gets booted with each node controller, and use `nsend` to talk to this process instead. Either way, the API data needs to reside in core CH so that the nodes can communicate effectively without sharing the same image.
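
To make the "API data in core CH" idea a bit more concrete, here is a minimal sketch of what the shared request/reply types and a handler might look like. Everything in it is an assumption for illustration: the names `MgmtRequest`, `MgmtReply`, `NodeState` and `handleMgmt` are hypothetical, endpoints are modelled as plain strings rather than real `Network.Transport` addresses, and the node's state is a toy list of open connections. In real CH the message types would also need `Binary` and `Typeable` instances before they could be sent with `nsend`; that part is left out so the sketch only depends on `base`.

```haskell
import Data.List (partition)

-- Hypothetical stand-in for an endpoint address (assumption: a real
-- implementation would use the transport's own address type).
type EndpointAddr = String

-- The interventions described in this issue, plus a simple query.
data MgmtRequest
  = CloseConnectionsBetween EndpointAddr EndpointAddr
  | TerminateEndpoint EndpointAddr
  | ShutdownTransport
  | ListConnections
  deriving (Show, Eq)

data MgmtReply
  = Closed Int                                  -- how many connections were closed
  | Connections [(EndpointAddr, EndpointAddr)]  -- currently open connections
  deriving (Show, Eq)

-- Toy model of what the node controller (or a service process next to it)
-- knows: the list of open connections as (from, to) pairs.
newtype NodeState = NodeState
  { openConnections :: [(EndpointAddr, EndpointAddr)] }
  deriving (Show, Eq)

-- Pure handler: given a request and the current state, produce a reply
-- and the updated state. A real handler would call into the transport.
handleMgmt :: MgmtRequest -> NodeState -> (MgmtReply, NodeState)
handleMgmt req (NodeState conns) = case req of
  ListConnections             -> (Connections conns, NodeState conns)
  ShutdownTransport           -> (Closed (length conns), NodeState [])
  TerminateEndpoint ep        -> close (\(a, b) -> a == ep || b == ep)
  CloseConnectionsBetween x y -> close (\c -> c == (x, y) || c == (y, x))
  where
    -- Drop every connection matching the predicate and report the count.
    close p =
      let (gone, kept) = partition p conns
      in  (Closed (length gone), NodeState kept)
```

Keeping the handler pure means the same types could be used whether requests arrive via the node controller or via a separately registered service process; only the code that receives the message and applies the result would differ.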

Providing some kind of service registry for `Network.Transport` itself is probably wrong. We'd need to provide an access point to the outside world, and it seems crazy not to use the node controller(s) for this. I suppose one way of doing this would be to have the backends open up an additional management port and use a separate control channel for management messages. I'm not sure what I think about forcing that on all backends though, and as @edsko mentioned elsewhere, we're trying to keep actual functionality out of the `Network.Transport` layer and push it to the implementations. Forcing each implementation to write code to handle management requests seems wrong.

One problem with using the node controller as the entry point for a management (and/or stats gathering) API is that you need to know which backend is in use. As an administrator I guess you should know that anyway, so maybe it's not a problem.

I also think that we should put a secure HTTP-based API around this, so that you can open up the management capabilities without having to make the node itself reachable. For example, you might not want to expose the node outside your LAN, but still allow administration to take place over the internet provided TLS is in play. That probably belongs either in a separate top-level project or in -platform, possibly bundled with other functionality into a single management web interface.




Tim Watson
October 17, 2013, 10:38 PM

See for the mechanism we will use.

In fact, the mechanism we will use is and that has been merged now. So we're ready to go on this now.

Tim Watson
July 5, 2013, 11:02 AM

See for the mechanism we will use.

Tim Watson
January 7, 2013, 7:01 PM

Comment:hyperthunk:01/07/13 07:01:05 PM:

Yes I agree with that in principle. I suspect we can do all of this in the node controller or with a service process next door to it.

Edsko de Vries
January 7, 2013, 6:41 PM

Comment:edsko:01/07/13 06:41:18 PM:

Without having thought about this too hard (so I could be wrong) I would recommend to do as much as you can at the level of CH and as little as possible at the level of NT. Transports are tricky to implement and should be kept as focused as possible.

Edsko de Vries
December 19, 2012, 1:23 PM

Comment:edsko:12/19/12 01:23:35 PM:


