dhcp class + using usi in daq #520
Codecov Report
@@ Coverage Diff @@
## master #520 +/- ##
==========================================
- Coverage 79.89% 79.58% -0.32%
==========================================
Files 21 21
Lines 3621 3649 +28
==========================================
+ Hits 2893 2904 +11
- Misses 728 745 +17
Continue to review full report at Codecov.
Something's not right with the refactoring -- it doesn't match what was in the design proposal and I'm really not convinced there should be multiple dhcp containers.
bin/usi
Outdated
@@ -1,6 +1,6 @@ | |||
#!/bin/bash -e | |||
|
|||
echo Starting USI | |||
docker run -d --network=host --name daq-usi daqf/usi | |||
docker run -d --network=host --name daq-usi daqf/usi || true |
under what condition is failure here ok?
The topo integration test doesn't need it, so the image wasn't built for that test. I could either build this image for topo or ignore the failure.
Something's not right with the refactoring -- it doesn't match what was in the design proposal and I'm really not convinced there should be multiple dhcp containers.
The refactoring wasn't covered in the design proposal so which part were you referring to?
config/system/default.yaml
Outdated
@@ -37,3 +37,6 @@ long_dhcp_response_sec: 105 | |||
|
|||
# finish hook: executed at the end of every test | |||
finish_hook: bin/dump_network | |||
|
|||
# usi |
Either make comment longer or remove
config/system/default.yaml
Outdated
@@ -37,3 +37,6 @@ long_dhcp_response_sec: 105 | |||
|
|||
# finish hook: executed at the end of every test | |||
finish_hook: bin/dump_network | |||
|
|||
# usi | |||
usi_url: localhost:5000 |
nit: newline at end of line/file
daq/dhcp_server.py
Outdated
self._initialize() | ||
except Exception as e: | ||
LOGGER.error( | ||
'Gateway initialization failed, terminating: %s', str(e)) |
!Gateway
daq/dhcp_server.py
Outdated
def _initialize(self): | ||
LOGGER.info('Initializing DHCP server on port %d', self._host_port) | ||
cls = docker_host.make_docker_host( | ||
'daqf/networking', prefix='daq', network='bridge') |
why networking for dhcp server?
daq/host.py
Outdated
@@ -602,6 +641,7 @@ def _get_switch_config(self): | |||
return { | |||
'ip': self.switch_setup.get('ip_addr'), | |||
'model': self.switch_setup.get('model'), | |||
'telnet_port': int(self.switch_setup.get('telnet_port', self._DEFAULT_TELNET_PORT)), |
'telnet_port' isn't right -- what is it used for? usi_port?
the telnet port for usi to talk to the switch. usi port is defined in the usi_url config
as I mentioned elsewhere "telnet_port" is way too generic and not all that interesting (telnet is a protocol) -- the name should indicate the use, not the protocol.
I understand that the _DEFAULT_TELNET_PORT should be more descriptive but the telnet_port is already under the switch_setup config so isn't saying switch_telnet_port redundant? same with username, password, model etc.
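For reference, the fallback under discussion can be sketched in isolation. This is a minimal sketch rather than the PR's code: `DEFAULT_SWITCH_TELNET_PORT` and the standalone `get_switch_config` function are hypothetical names, and `switch_setup` is assumed to be a plain dict.

```python
# Hypothetical standalone version of the lookup in _get_switch_config.
DEFAULT_SWITCH_TELNET_PORT = 23  # assumed name; the diff uses _DEFAULT_TELNET_PORT


def get_switch_config(switch_setup):
    """Build the switch config, falling back to the default port when unset."""
    return {
        'ip': switch_setup.get('ip_addr'),
        'model': switch_setup.get('model'),
        # int() normalizes values that arrive as strings from a config file.
        'telnet_port': int(
            switch_setup.get('telnet_port', DEFAULT_SWITCH_TELNET_PORT)),
    }
```

Whatever key name is chosen, only the dict key and the constant change; the fallback logic stays the same.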
daq/host.py
Outdated
@@ -59,6 +64,7 @@ class ConnectedHost: | |||
"""Class managing a device-under-test""" | |||
|
|||
_STARTUP_MIN_TIME_SEC = 5 | |||
_DEFAULT_TELNET_PORT = 23 |
needs more qualifiers... maybe add switch in there?
proto/system_config.proto
Outdated
@@ -114,6 +114,9 @@ message DaqConfig { | |||
|
|||
// Set time between port disconnect and host tests shutdown | |||
int32 port_flap_timeout_sec = 48; | |||
|
|||
// USI url | |||
string usi_url = 49; |
I would encapsulate this in a message -- it has a feeling that eventually it might have more parameters.
proto/system_config.proto
Outdated
@@ -139,6 +142,9 @@ message SwitchSetup { | |||
|
|||
// IP address template and subnet for module ip addresses | |||
string mods_addr = 16; | |||
|
|||
// Switch connect Telnet port |
I think this needs a slightly better name. "telnet_port" is very generic... maybe cmd_port? Talk about what its use is, not the underlying protocol.
In the proposal diagrams all the DHCP servers are inside the Gateway
container, not separate containers.
I would build the container for the topo tests. I'd rather keep the hard
failure in case of an error.
You mentioned that we should limit to 1 DHCP server per container so I thought the diagram was clear in that the DHCP servers are part of a gateway but are different containers; plus I don't think it's possible to run multiple dnsmasq instances in the same container since they will attempt to listen on the same port. The original intention was for the gateway to be its own container with multiple DHCP servers attached, but I realized the gateway would be doing nothing at that point, so I just made the first DHCP server the gateway container.
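The port-conflict concern can be illustrated with plain sockets. This is a minimal sketch under stated assumptions: both servers bind the same address and port in one network namespace with no address-reuse options set. It uses an OS-assigned port on loopback instead of the DHCP server port (UDP 67) so that it runs without root, and `second_bind_fails` is a hypothetical helper name.

```python
import socket


def second_bind_fails():
    """Return True if a second UDP socket cannot bind an already-bound port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
        port = first.getsockname()[1]
        try:
            # Same address and port as the first socket.
            second.bind(('127.0.0.1', port))
        except OSError:
            return True
        return False
    finally:
        first.close()
        second.close()
```

Whether this applies to a particular dnsmasq setup depends on how each instance is configured to bind, which the sketch does not model.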
Yes -- that is true. I think my reaction is something more about describing
the use, not the protocol... e.g. cmd_port, management_port, etc...
I don't think that's a safe assumption to make about the diagram. The proposal/diagram should be a very clear description of the relevant aspect. The gateway is a container, and the diagram shows DHCP servers in that container as the simplest explanation. This is the kind of complexity I was talking about when I said that I don't think we should be considering the multiple-DHCP servers if it complicates the architecture at all. You're complicating the (conceptual) architecture for something that's not needed. The "gateway" is a higher-order concept than the DHCP server, since it's a container that does a bunch of different things, one of which is being a DHCP server, so it does not make sense to invert the relationship. If we need multiple servers in the future then we can cross that bridge when we come to it, but for now the basic structure of "the gateway" should remain constant. It's not a big structural difference over what you have now, it's just organized slightly differently. I would recommend you convert your new DHCPServer class to a GatewayHost class and then we can deal with the ramifications of multiple of them later. It's essentially keeping the refactoring smaller, so not changing the significant semantics (e.g. the gateway construct remains unchanged), but it's setting the groundwork for a different change in the future (e.g. the re-introduction of a DHCPServer or something).
Updated to use USI OVS switch. PTAL.
bin/usi
Outdated
@@ -0,0 +1,18 @@ | |||
#!/bin/bash -e |
I know it's weird and a bit off, but for now I think this should go in cmd/ and not bin/ -- simply because it's most similar to things like cmd/faux which do the same thing (start container). At some point we have to clean everything up so it makes sense, but for now I'm going with consistency...
cmd/exrun
Outdated
@@ -81,7 +82,8 @@ sudo rm -f $cleanup_file | |||
function autostart { | |||
tmp=`mktemp` | |||
echo DAQ autostart $@ | |||
eval $@ | tee $tmp | |||
eval $@ > $tmp #Don't use "eval $@ | tee $tmp" here. Doesn't work for bin/usi |
space after #, and period at end of sentence.
@@ -31,7 +33,7 @@ protected String getInterfaceByPort(int devicePort) throws FileNotFoundException | |||
Matcher m = pattern.matcher(line); | |||
return m.find(); | |||
}).findFirst().get(); | |||
|
|||
bufferedReader.close(); |
generally the java try-with-resources pattern is better than explicitly closing the object. https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
daq/gateway.py
Outdated
LOGGER.info('Initializing gateway %s as %s/%d', | ||
self.name, host_name, host_port) | ||
self.tmpdir = self._setup_tmpdir('inst', host_name) | ||
self.host = self._start_dhcp_server(host_port).host |
As per previous discussion, the top-level primary construct here should be the gateway (or "networking" container if you prefer). For the simple case, DHCP should be a sub-component of that.
Decided to try out a gateway container as the top level construct with separate dhcp server container(s). All changes related to that are contained in the last commit. I think the changes maintain the current structure but offer more flexibility without much complexity. There can be more optimization in terms of the starting of the DHCP server containers but I want to see what you think about this first.
I think the underlying problem is that you're conflating two different
concepts and so when you're trying to separate out multiple DHCP servers
it's breaking the abstraction in other layers of the (conceptual) stack.
Part of the problem I'll admit is that the terms "gateway" and "networking"
(container) are both ambiguous and unnecessarily different (e.g. it should
have been the networking class that runs the networking container...). The
container really represents a host in the networking sense, which can
manage a bunch of different networking services (NTP, DNS, DHCP, NAT,
etc...). So, for the simple case there should be one instance of this that
encapsulates all of the necessary networking services (including DHCP).
This is more or less a manifestation of something like my home network box
that does this stuff. Key concept here is that DHCP is a service that runs
inside of this container.
Now comes the question of handling multiple DHCP servers, if/as/when
necessary. To do that, it might require spinning up an additional
(networking) container that *only* does DHCP and nothing else. So, the
property of the simple case (only one) doesn't complicate the system, but
it still allows for multiple server hosts if/as necessary. Conceptually,
this is the same networking services image, it just happens to be invoked
in a way that only does DHCP and not the other stuff. Generalizing, there
might be some other redundant service as well and this would provide a
container that could also provide that (e.g. a host can be configured to
have multiple DNS servers).
I think the problem can be distilled down to the name of the
"_start_dhcp_server" method -- which is really starting a dhcp container
(not just server). It should be something like
"_start_networking_host(secondary=True)" or something like that. And then
the first instance would be just "_start_networking_host()". So, from a
container perspective there's only one kind (networking services), and the
first instance runs all sorts of stuff. And non-primary instances of it
are more or less the same just running a diminished set of services.
I would recommend starting off by trying to diagram this cleanly -- just
one set of symbols for containers, and then a representation of the
services that run inside of those containers... would be a lot easier to
diagram and ratify before refactoring the code. And then there's two
versions -- one for the simple case that just supports one DHCP server, and
then another that supports two. The problem we ran into here is your
original diagram was ambiguous (given that we interpreted it differently)
and that disconnect never got addressed before moving on to code.
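The structure described above can be sketched roughly as follows. This is a hypothetical illustration only: `NetworkingHost`, `Gateway`, the service tuples, and the naming scheme are assumptions made for the example and are not taken from the DAQ code; only the `_start_networking_host(secondary=True)` shape comes from the comment itself.

```python
# Hypothetical sketch: one kind of networking-services container, where
# secondary instances run a reduced set of services (DHCP only).
PRIMARY_SERVICES = ('dhcp', 'dns', 'ntp', 'nat')  # assumed service names
SECONDARY_SERVICES = ('dhcp',)


class NetworkingHost:
    """One networking-services container and the services enabled in it."""

    def __init__(self, name, secondary=False):
        self.name = name
        self.secondary = secondary
        self.services = SECONDARY_SERVICES if secondary else PRIMARY_SERVICES


class Gateway:
    """Top-level construct owning one primary host and optional secondaries."""

    def __init__(self, name):
        self.name = name
        self.hosts = []
        # The simple case: a single host running every networking service.
        self._start_networking_host()

    def _start_networking_host(self, secondary=False):
        # A real implementation would launch a container here; this sketch
        # only records which services the instance would run.
        host_name = '%s-net%d' % (self.name, len(self.hosts))
        host = NetworkingHost(host_name, secondary=secondary)
        self.hosts.append(host)
        return host
```

In this arrangement the single-server case never touches the secondary path, and an additional DHCP server would be one more `_start_networking_host(secondary=True)` call on the same gateway.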
This reverts commit 1f300e2.
I think it's best to keep the original gateway class for now. I've reverted changes related to gateway and dhcp server. Spawning more networking containers (separate from gateway or not) does bring more complexity with the current ntp setup, which will take more time than we can afford right now to resolve. We can revisit this when we need to convert gateway for the orchestrated testing / support multiple dhcp servers. PTAL
Just one very small comment about resource handling with BufferedReader
Matcher m = pattern.matcher(line); | ||
return m.find(); | ||
}).findFirst().get(); | ||
bufferedReader.close(); |
close is handled implicitly by the try clause, so you shouldn't do it explicitly here