Friday, June 28, 2019

Bare Metal Management With Razor (4/N)

Having digressed into issues of gender and Twitter censorship, let's get back to talking about Razor.

I'm pretty happy with the system as a whole. It's easy to set up and easy to understand, and I appreciate that they've put together a bunch of off-the-shelf components in a way that facilitates extension.

The experimental install that I documented (1, 2, 3) would need significant work to support 24x7 operations. Things that would need to be done:

  • Redundant Razor servers. This is easy enough to accomplish, since the server itself is stateless. Just build a couple and hide them behind a VIP.
  • HA Postgres DB. The current Postgres docs list a number of different solutions which are supported to varying degrees.
  • HA DHCP. A little Googling suggests that Dnsmasq isn't awesome at failover; the recommendation seems to be to use ISC DHCP instead because it has a built-in failover protocol (see the sketch after this list).
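
For reference, here's a minimal sketch of what the primary side of an ISC DHCP failover pair might look like; the addresses and timing parameters are illustrative assumptions, not recommendations, and the secondary would use 'secondary;' and omit mclt/split:

# /etc/dhcp/dhcpd.conf on the primary (addresses are assumptions)
failover peer "razor-dhcp" {
  primary;
  address 192.168.15.10;        # this server
  peer address 192.168.15.11;   # the secondary
  port 647;
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  split 128;                    # split the address pool 50/50
  load balance max seconds 3;
}

subnet 192.168.15.0 netmask 255.255.255.0 {
  pool {
    failover peer "razor-dhcp";
    range 192.168.15.2 192.168.15.253;
  }
}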

There's also the question of how you handle multiple LANs. DHCP requests, by design, are confined to a single broadcast domain. If you want Razor to be able to handle requests for hosts across multiple broadcast domains then you need to overcome this limitation. Typically this is accomplished via DHCP relay, the details of which will vary depending on what DHCP server you're using.
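
For example, with ISC's dhcrelay the relay agent in each remote broadcast domain just gets pointed at the central DHCP server (the interface name and server address below are assumptions):

# run on a router/host in the remote broadcast domain
dhcrelay -i eth0 192.168.15.254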

Lastly, there's the question of IP management and DNS. If you're imaging systems, giving them names, and assigning them IP addresses semi-permanently, you'd like a system that's aware of the fact and then does the right things: updates DNS records, stops offering assigned IPs via DHCP, and so on. Dnsmasq doesn't do anything in this regard, so in a real world setting you'd want a smarter piece of software handling DNS and DHCP. ISC has a system called Kea that is intended for this use case, though I wasn't aware that it even existed until writing this post.
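
I haven't actually used Kea, so treat the following as a rough sketch of its JSON configuration format rather than a known-working setup; the interface, subnet, and pool are borrowed from the lab network used elsewhere in this series:

{
  "Dhcp4": {
    "interfaces-config": { "interfaces": [ "enp0s3" ] },
    "subnet4": [ {
      "subnet": "192.168.15.0/24",
      "pools": [ { "pool": "192.168.15.2 - 192.168.15.253" } ]
    } ]
  }
}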

Anyway, in conclusion: Razor is pretty awesome, supports the bare metal management use case better than other systems I've looked at, and does OS imaging pretty well too. It's not quite as robust out-of-the-box as Foreman or MAAS in terms of things like distributed operation and IP management, but that can be overcome with a little bit of additional work.

Who Could Have Foreseen?

On the face of it, it's silly that Twitter blocked David Neiwert's account. But...

THIS IS EXACTLY WHAT WE SAID WOULD HAPPEN!!!

Please don't misunderstand me, I really like David's writing. But we told all y'all over and over and over again that there was no way for Twitter (or Facebook or YouTube or...) to make nuanced judgments at scale about who/what should be blocked, that calling for certain content to be blocked would inevitably lead to other stuff getting swept up as well, and that it would be better to just forget the whole thing.

So you will find my sympathies somewhat reduced when what has come to pass is exactly what we predicted would come to pass. Innocuous content getting blocked is a known side-effect of content moderation regimes, regimes which I expect a lot of the people carping about David's treatment actively support.

So maybe now that the "wrong people" are getting caught up in it we can all just step back and reconsider the entire concept, yes?

Tuesday, June 25, 2019

Is Sex Or Gender Assigned At Birth?

This started out as a footnote to my previous post, but it turns out that there's enough meat on the bone to merit its own discussion. So, what is assigned at birth, sex or gender?

Consider the following ritual as if you're an anthropologist studying a foreign culture:

  1. A baby is born.
  2. The baby is examined by a designated baby examiner.
  3. The designated baby examiner pronounces that the baby is "X".
The complication that arises in interpreting this ritual is that X can refer to sex or gender class membership, but it's not immediately clear which is intended.

Further investigation reveals several additional facts:

Let's consider the hypothesis that X refers to gender. The implication is that the baby examination procedure is fundamentally mistaken, and that generations of baby examiners around the world have been rendering gender judgements on the basis of irrelevant information. It certainly wouldn't be the first time that a cultural ritual has been found to be totally baseless, so history tells us that we shouldn't rule it out of hand.

How about the opposite hypothesis, that X refers to sex? This interpretation dovetails nicely with what we know: Baby examiners are collecting information about sex via a proxy (appearance of external genitalia) which is known to be reliable, and then making declarations on the basis of that information.

Both of the above interpretations are plausible, but which one is more plausible? I will submit that the "sex" hypothesis is consistent with the observed behavior, and thus is more likely to be true than the "gender" hypothesis, which is not consistent with observed behavior.

The obvious follow-ons are then "How does gender happen?" and "Why is gender correlated with sex?". The answer suggested by the above is that sex is assigned at birth and gender is subsequently constructed on the basis of sex. Again, this process nicely explains observed behavior while potential alternatives do not.

Are You Still On Your Default Name?

Overheard in one of our office Slack channels:

trans people don't say "did you assume my gender"
we say "nice gender did your mom pick it"
and
"extremely funny from someone on their default name"
These were mentions, not uses, but given the speaker I believe it's safe to treat these pronouncements as capturing a certain strand of thought. It's not a sentiment that I recall having been exposed to before and so is worth writing down.

Regarding the first phrase, apparently it's a meme (so minus points for originality on the part of the speaker?). The initial read on this is that it's a pithy restatement of the idea that gender is assigned at birth. However, the phrase also echoes (deliberately, I assume) "Nice shirt, did your mom pick it?", the implication by analogy being that identifying as the gender you were assigned at birth is not fashionable/stylish.

The "default name" quote doesn't appear to be a meme or anything like that. The implication in the statement is that using the name you were given at birth shows... a lack of reflection, maybe? Or, again, style? In any case, it's grounds for questioning their judgement.

Taken together the quotes above express a certain... aesthetic sensibility, maybe? I find the comment about "default name" to be annoying on the grounds that I don't think names are particularly expressive. There's a big switching cost associated with changing your name and not a whole lot of benefit (outside of certain situations)... but maybe that's the point? Could name changing be a form of costly signaling?

Critiques of (not) choosing your gender, on the other hand, have some facial plausibility, but a lot really rides on the interpretation of "choose". Is "choosing" merely engaging in behaviors not typically associated with your gender, or is "choosing" only choosing if you make some sort of public declaration?

Consider: I've written elsewhere that in some of the places I've lived I've been well outside the mainstream in terms of gender presentation. At the same time, however, I've never identified as gender-nonconformant or trans or anything of that nature and think that it would stretch those terms beyond meaning were I to do so. My point is that I was deviating from expected behavior, so could be said to be "choosing my gender" in that sense. At the same time, however, I never publicly identified as anything other than the gender associated with the sex I was assigned at birth, in which case it can be argued that I wasn't making any sort of choice.

Anyhow, an interesting phenomenon, worth tucking away for future consideration.

Sunday, June 16, 2019

When Orders Collide

So, what happens when first-order and second-order arguments get entangled? You get a mess, that's what.

Justin Weinberg, in his recent post Trans Women and Philosophy: Learning from Recent Events, says the following in the section "Some final notes":

Please avoid first-order discussion of trans-inclusive and trans-exclusionary arguments or arguments about bathroom or prison policies and the like; I’m not interested in hosting those disputes in the comments on this post.
Which, in his defense, seems like a reasonable rule if you're trying to have a second-order discussion. However! I then recalled the following bit from the original piece by t philosopher:
My gender is not up for debate. I am a woman. Any trans discourse that does not proceed from this initial assumption — that trans people are the gender that they say they are — is oppressive, regressive, and harmful.
This is followed by a "call to action" which states that contrary views should not be published, spoken, or otherwise given a platform.

It seems to me that t philosopher's position effectively couples first-order and second-order concerns, and that this is a major contributing factor to why discussions of some trans-identity-related issues have proven intractable.

Publication and speaking are the tools which philosophers have traditionally used to investigate first-order problems. I publish a paper saying "X is bad", someone else publishes a rebuttal "No, X is good", someone else chimes in with "No, you're both wrong", etc. t philosopher asserts that these traditional tools should be restricted, and justifies that restriction based on a first-order consideration. Which, seemingly inevitably, leads to the situation where one can't discuss which tools are appropriate without bringing up and examining first-order concerns.

I would really like to have seen Justin grapple with this aspect of the discussion more. Specifically, what are the fallbacks if part of the traditional philosophic tool suite has been proscribed? Presumably he wants to see philosophy continue as a going concern, which would seem to necessitate some viable, alternative approach.

Now, in my last post I mentioned that I'd had an epiphany. The epiphany is: This problem isn't confined to philosophy.

You can see variants on the dilemma above playing out in other places. "It's not my job to educate you", for example, is primarily a request for people to engage in self-education. But it also has the side-effect of removing a useful tool from the toolkit, specifically the ability to identify a particular individual's opinion on some topic.

More generally, any assertion that investigative tools should be limited on the basis of first-order concerns is almost certainly going to cause problems if those same tools are needed to validate the underlying concern. Having restated the problem like that, it starts to look an awful lot like a form of epistemic closure: The tools needed to validate a belief are forbidden as a consequence of that same belief, thus the belief itself becomes immune to correction.

Yeah, Justin definitely needs to address that: How do we prevent t philosopher's position from leading to epistemic closure?

Interpreting And Taking Action On Requests To "Feel X"

Feelings are endogenous; that seems to be the upshot of the many and varied exhortations that we stop telling people how they should feel. No one can make you feel a particular way; at best they can create an environment conducive to a certain feeling. So if some group of people requests to feel a certain way, how is that request to be interpreted and acted upon?

Take, as a convenient and (presumably) non-contentious example, recent requests by La Salle University students for improved physical safety. Per the article, students are feeling a "sense of fear", and have asked administrators to implement "better security". How should the administration respond?

One way to proceed is to take steps to ensure that students actually are safe. You identify relevant measures of safety, whatever they may be, and then do whatever is needed to improve them (if necessary). At some point, presuming you execute well, students will be safe in the relevant sense. Now, setting aside the specific facts regarding La Salle (since we're not here to argue that case), suppose that the students come back and say that they still don't feel safe and that the administration needs to do more?

Let's stop and note that there are a couple of assumptions lurking in the background of the administration's initial response:

  • The administrators and students share an understanding of what it means to "be safe".
  • There's a correlation between "feeling safe" and "being safe".

What's interesting here is that there's both a normative/semantic component (shared definition) and an empirical component (correlation between feeling and being). Disagreements can arise when either assumption fails.

Were I in the administration's shoes I would tackle the empirical assumption in the hopes that it's more tractable. A conversation of the form "Here's why we think you're safe."/"Here's why we still feel unsafe." might break in a few ways:

  • Students are persuaded they're safe.
  • Administrators are persuaded that more genuinely needs to be done.
  • Discussion reveals a shared understanding of the notion of "safety", but students and administrators cannot reach a consensus on the empirical question.
  • Discussion reveals a lack of shared understanding of the notion of "safety".

It's easier to deal with empirical disagreements than normative disagreements. I'm possibly naive, but it seems like if you have a defensible case then making an executive decision is justified (and probably a foregone conclusion). Students gonna student and all that jazz; your life won't be easy, but that's what administrators are paid to do.

Lack of shared understanding seems like it could be a minefield, especially if the topic is more contentious than simply physical safety. I know I wouldn't want to be responsible for asking people to elaborate their beliefs in the era of "It's not my job to educate you". If asking questions is precluded then the alternative seems to pretty much be deference, at which point you're going to have a bunch of people razzing you for caving in.

Nothing discussed above is unique to educational settings; the same sort of dynamic is in play whenever there's a request that one group ensure that another group feels a certain way. And let's stop right there, I think I just had a minor epiphany. Rather than bury the lede I'll take that up in my next post.

But Wait! I Thought Everyone Rejected the Repugnant Conclusion?

Just a reminder that total happiness utilitarians are not a bogeyman; they actually exist in the wild. To wit, Torbjörn Tännsjö says:

The crucial thing is not how many people are living right now, but the sum total of happiness. Perhaps we should be fewer now to be able to go on for millions of years. Some people have a theory that we’re perhaps too many right now, and I don’t object to that. The idea is that we should be as many as possible at each point in time and go on for as long as possible. The rationale behind this is the idea that we should maximize the sum total of happiness.

That is all, you may now go about your business.

Bare Metal Management With Razor (3/N)

We got our PXE systems up and running on the Razor Microkernel. Next step is to image them!

Imaging with Razor is a two-step process:

  1. Define some tags to classify systems.
  2. Define some policies to image systems on the basis of their tags.

Step 1: Tags

Razor "tags" are essentially a rule-based system for classifying machines. You set up rules ahead of time, and then Razor automatically tags systems as they are discovered. For example, this rule says that any system with <2G RAM should get the 'small' tag:

[root@razor log]# razor create-tag --name small --rule '["<", ["num", ["fact", "memorysize_mb"]], 2048]'
From http://localhost:8150/api/collections/tags/small:

      name: small
      rule: ["<", ["num", ["fact", "memorysize_mb"]], 2048]
     nodes: 0
  policies: 0
   command: http://localhost:8150/api/collections/commands/1
The next time the 1G VM checks in it will have the small tag applied:
[root@razor log]# razor nodes
From http://localhost:8150/api/collections/nodes:

+-------+-------------------+--------+--------+----------------+
| name  | dhcp_mac          | tags   | policy | metadata count |
+-------+-------------------+--------+--------+----------------+
| node1 | 08:00:27:0c:fd:f4 | small  | ---    | 0              |
+-------+-------------------+--------+--------+----------------+
| node2 | 08:00:27:43:84:1d | (none) | ---    | 0              |
+-------+-------------------+--------+--------+----------------+
...
Similarly, we can define a rule that tags all systems with >2G as 'large':
[root@razor log]# razor create-tag --name large --rule '[">", ["num", ["fact", "memorysize_mb"]], 2048]'
From http://localhost:8150/api/collections/tags/large:

      name: large
      rule: [">", ["num", ["fact", "memorysize_mb"]], 2048]
     nodes: 0
  policies: 0
   command: http://localhost:8150/api/collections/commands/2
and then, the next time the 4G node checks in...
[root@razor log]# razor nodes
From http://localhost:8150/api/collections/nodes:

+-------+-------------------+-------+--------+----------------+
| name  | dhcp_mac          | tags  | policy | metadata count |
+-------+-------------------+-------+--------+----------------+
| node1 | 08:00:27:0c:fd:f4 | small | ---    | 0              |
+-------+-------------------+-------+--------+----------------+
| node2 | 08:00:27:43:84:1d | large | ---    | 0              |
+-------+-------------------+-------+--------+----------------+
...
Pretty neat, huh?

One minor shortcoming of Razor is that you can't arbitrarily tag a set of servers; tag application is entirely rule-based. This adds a little bit of complication to the common use case of "I've got this bunch of servers I just brought online and I know exactly what I want to use them for". You can fake that functionality using the 'in' operator and a list of MAC addresses:

razor create-tag --name my-set-of-servers \
  --rule '["in", ["fact", "macaddress"], "de:ea:db:ee:f0:00", "de:ea:db:ee:f0:01"]'
This seems like it's a popular use case, as later versions of Razor introduced the has_macaddress and has_macaddress_like operators to support this type of rule.
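
I haven't tested the newer operators myself, but per the docs they would reduce the rule above to something like:

razor create-tag --name my-set-of-servers \
  --rule '["has_macaddress", "de:ea:db:ee:f0:00", "de:ea:db:ee:f0:01"]'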

Step 2: Policies

The second half of setting up Razor for system imaging is to define policies, which tell Razor what to install and how it should be installed. Policies are triggered via tag matching, which automatically applies the appropriate policy to machines with a particular set of tags. For purposes of this demonstration let's assume that we want to install Ubuntu on small nodes and CentOS on large nodes.

Before we can create a policy we need to identify a few things:

  • What collection of bits will be used to image the systems?
  • What are the mechanics for actually laying the bits down?
  • How will the handoff from Razor to a configuration management system be handled?
Bullets one and two are handled via the creation of a repository. The basic form of the command is
razor create-repo --name <name> --task <task> [ --iso-url <url> | --url <url> ]

One choice which needs to be made at this point is whether the Razor server is going to serve up the bits directly or simply point systems to another location. If you select --iso-url the Razor server will download the ISO and unpack it; make sure you have ample free disk space. --url will cause the Razor server to point to the specified address rather than serving up the content directly.
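
As a hypothetical example of the latter, pointing at a pre-existing mirror rather than downloading an ISO would look something like this (the mirror URL is made up):

razor create-repo --name centos-7-mirror --task centos/7 \
  --url http://mirror.example.com/centos/7/os/x86_64/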

The other thing you need to do is specify a task, which provides Razor with the instructions on how to bootstrap the automated installation process. Task creation is somewhat involved and not for the faint-of-heart but, thankfully, Razor comes with a bunch of pre-defined tasks for common operating systems:

[root@razor ~]# razor tasks
From http://localhost:8150/api/collections/tasks:

+-----------------+----------------------------------------------------------------+---------+--------------------------------------+
| name            | description                                                    | base    | boot_seq                             |
+-----------------+----------------------------------------------------------------+---------+--------------------------------------+
| centos          | CentOS Generic Installer                                       | redhat  | 1: boot_install, default: boot_local |
+-----------------+----------------------------------------------------------------+---------+--------------------------------------+
...
+-----------------+----------------------------------------------------------------+---------+--------------------------------------+
| windows/8pro    | Microsoft Windows 8 Professional                               | windows | 1: boot_wim, default: boot_local     |
+-----------------+----------------------------------------------------------------+---------+--------------------------------------+
...

So let's set up repos for CentOS 7 and Ubuntu Xenial, since there are pre-defined tasks for both of those:

[root@razor ~]# razor create-repo --name centos-7 --task centos/7 --iso-url http://centos.s.uw.edu/centos/7.6.1810/isos/x86_64/CentOS-7-x86_64-DVD-1810.iso
From http://localhost:8150/api/collections/repos/centos-7:

     name: centos-7
  iso_url: http://centos.s.uw.edu/centos/7.6.1810/isos/x86_64/CentOS-7-x86_64-DVD-1810.iso
      url: ---
     task: centos/7
  command: http://localhost:8150/api/collections/commands/6

[root@razor ~]# razor create-repo --name ubuntu-xenial --task ubuntu/xenial --iso-url http://releases.ubuntu.com/16.04/ubuntu-16.04.6-server-amd64.iso
From http://localhost:8150/api/collections/repos/ubuntu-xenial:

     name: ubuntu-xenial
  iso_url: http://releases.ubuntu.com/16.04/ubuntu-16.04.6-server-amd64.iso
      url: ---
     task: ubuntu/xenial
  command: http://localhost:8150/api/collections/commands/11
We now have two repos, centos-7 and ubuntu-xenial, that can be referenced in policies. The razor server will download and unpack the associated ISOs in the background.

The other item we have to consider for a policy is the Razor → configuration management system handoff. Razor handles this by means of brokers, and supports several popular configuration management systems (namely Puppet and Chef) out of the box (see razor create-broker --help for a complete listing). Additionally, if you want to integrate with a different system like Salt or Ansible Razor allows you to write your own brokers.
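
For illustration, a Puppet handoff would be set up along the following lines; consult razor create-broker --help for the exact configuration keys, since the server and environment values here are assumptions on my part:

razor create-broker --name puppet --broker-type puppet \
  --configuration server=puppet.example.com \
  --configuration environment=production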

I'm going to keep things simple and just create a no-op broker:

[root@razor ~]# razor create-broker --name=noop --broker-type=noop
From http://localhost:8150/api/collections/brokers/noop:

           name: noop
    broker_type: noop
  configuration: {}
       policies: 0
        command: http://localhost:8150/api/collections/commands/3
This type of broker doesn't try to do any sort of hand off; it's basically just a placeholder.

Alright, we've got repositories and a broker, let's create the policies:

razor create-policy --name small-nodes --repo ubuntu-xenial --broker noop --tag small --hostname 'ubuntu${id}.localdomain' --root-password not_secure
From http://localhost:8150/api/collections/policies/small-nodes:

       name: small-nodes
       repo: ubuntu-xenial
       task: ubuntu/xenial
     broker: noop
    enabled: true
  max_count:
       tags: small
      nodes: 0
    command: http://localhost:8150/api/collections/commands/12
The policy 'small-nodes' will install Ubuntu Xenial on any node with the 'small' tag. The host will be named according to its ID and have the specified root password. Doing it again for CentOS:
[root@razor etc]# razor create-policy --name large-nodes --repo centos-7 --broker noop --tag large --hostname 'centos${id}.localdomain' --root-password not_secure
From http://localhost:8150/api/collections/policies/large-nodes:

       name: large-nodes
       repo: centos-7
       task: centos/7
     broker: noop
    enabled: true
  max_count:
       tags: large
      nodes: 0
    command: http://localhost:8150/api/collections/commands/13
Same deal for the most part: Nodes with the tag 'large' will get CentOS 7 and the associated hostname.

No additional steps are needed to kick off imaging. The next time either host checks in it will have the policy applied and will start the appropriate imaging process. For example:

[root@razor ~]# razor nodes
From http://localhost:8150/api/collections/nodes:

+-------+-------------------+-------+-------------+----------------+
| name  | dhcp_mac          | tags  | policy      | metadata count |
+-------+-------------------+-------+-------------+----------------+
| node1 | 08:00:27:0c:fd:f4 | small | small-nodes | 0              |
+-------+-------------------+-------+-------------+----------------+
| node2 | 08:00:27:43:84:1d | large | ---         | 0              |
+-------+-------------------+-------+-------------+----------------+
...
node1 has completed its scheduled check-in and has had the small-nodes policy applied. If you're watching the system on console it should reboot and go into the Ubuntu installation process.

When initially working through this process I got

ipxe no configuration methods succeeded
FATAL:  INT18:  BOOT FAILURE
on console. Per the suggestion at http://ipxe.org/err/040ee1 a hard reboot temporarily solved the problem. A permanent fix, at least in the case of VirtualBox, is to disable the "Enable I/O APIC" feature for the VM.
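
That setting lives under System → Motherboard in the GUI, or can be toggled from the command line while the VM is powered off ('node1' here is whatever you named the VM in VirtualBox):

VBoxManage modifyvm node1 --ioapic off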

No further intervention should be required at this point; both VMs should come up with the appropriate operating systems and host names. Here's what I ended up with:

[root@razor etc]# razor nodes
From http://localhost:8150/api/collections/nodes:

+-------+-------------------+-------+-------------+----------------+
| name  | dhcp_mac          | tags  | policy      | metadata count |
+-------+-------------------+-------+-------------+----------------+
| node1 | 08:00:27:0c:fd:f4 | small | small-nodes | 1              |
+-------+-------------------+-------+-------------+----------------+
| node2 | 08:00:27:43:84:1d | large | large-nodes | 1              |
+-------+-------------------+-------+-------------+----------------+
Note that, in addition to listing a policy, the table also shows that both VMs have some metadata defined now. Let's see what it is:
[root@razor etc]# razor nodes node1
From http://localhost:8150/api/collections/nodes/node1:

          name: node1
      dhcp_mac: 08:00:27:0c:fd:f4
         state:
                     installed: small-nodes
                  installed_at: 2019-06-07T13:39:30-07:00
                         stage: boot_local
        policy: small-nodes
  last_checkin: 2019-06-07T13:13:21-07:00
      metadata:
                  ip: 192.168.15.74
          tags: small
...
In this case the metadata lists the IP assigned to the host.

And that's what imaging with Razor looks like, modulo some configuration management stuff that I decided to elide. That's it for the present; I expect that I'll write up one more post with some concluding thoughts in the near future.

Bare Metal Management With Razor (2/N)

Having set up our Razor environment, it's now time to put it through its paces. The first order of business is to get a VM or two up and registered with the system.

Start by creating two VMs, making the following configuration tweaks:

  1. Configure them to use the natnet1 NAT network.
  2. Enable network boot, and put it first in boot order.
  3. Give one VM 1G of RAM and another 4G of RAM.
Power on the 1G VM and watch its console. It should PXE boot off of the Razor/Dnsmasq infrastructure and eventually come up at a login prompt; default creds are root/thincrust.

We haven't installed an OS yet, so what is the VM running? It's running the Razor Microkernel, "a small, in-memory Linux kernel that is used by the Razor Server for dynamic, real-time discovery and inventory of the nodes that the Razor Server is managing". It's entirely ephemeral, no disk needed. And, better yet, the docs on how to modify the kernel for your own needs are pretty good. So if you want to, say, include tools for updating system firmware, or running tests/benchmarks, or anything of that nature, it's not terribly hard to do.

Alright, on with the show! On the Razor server, do:

gem install razor-client
and then run razor nodes to see what's in inventory:
[root@razor log]# razor nodes
From http://localhost:8150/api/collections/nodes:

+-------+-------------------+--------+--------+----------------+
| name  | dhcp_mac          | tags   | policy | metadata count |
+-------+-------------------+--------+--------+----------------+
| node1 | 08:00:27:0c:fd:f4 | (none) | ---    | 0              |
+-------+-------------------+--------+--------+----------------+
...
There you have it... there is no step three. Alright Razor, tell me about node1:
[root@razor log]# razor nodes node1
From http://localhost:8150/api/collections/nodes/node1:

          name: node1
      dhcp_mac: 08:00:27:0c:fd:f4
         state:
                  installed: false
        policy: ---
  last_checkin: 2019-06-05T10:09:34-07:00
      metadata: ---
          tags: (none)
...
So far, so good. What sort of "facts" does the server know about the node?
[root@razor log]# razor nodes node1 facts
From http://localhost:8150/api/collections/nodes/node1:

          network_enp0s3: 192.168.15.0
              network_lo: 127.0.0.0
           system_uptime:
                            seconds: 731
                              hours: 0
                               days: 0
                             uptime: 0:12 hours
...
The server has recorded the system configuration and some other relevant items, like IP address and uptime. Now power up the 4G VM; it should do the same thing.

At this point we've achieved our first goal, which was to find a way to quickly boot up a bare metal system and allow us to do some poking around prior to making decision about OS installation. And now, a little digression.

I first got interested in bare metal management a bazillion years ago when I was working for a system integrator. We had to be able to build, inventory, and test racks of servers as efficiently as possible. Having that sort of capability is useful not only for system integrators, but anyone who has to deal with hardware at scale. As a by-product of its operation, Razor persists a bunch of readily-accessible information about hardware configuration:

[root@razor ~]# su - postgres
Last login: Fri Jun  7 10:17:54 PDT 2019 on pts/0
-bash-4.2$ psql -c 'select name, hw_info, facts from nodes' razor_prd postgres
 name  |  hw_info                                                                                         | facts
 node2 | {fact_boot_type=pcbios,mac=08-00-27-43-84-1d,serial=0,uuid=55c8c240-9c29-43cc-ab69-c55720067fa4} | {"network_enp0s3":"192.168.15.0", ...
 node1 | {fact_boot_type=pcbios,mac=08-00-27-0c-fd-f4,serial=0,uuid=f3b91803-383d-43a7-bb76-97f995cd4118} | {"network_enp0s3":"192.168.15.0", ...
(2 rows)
facts contains a JSON blob with a bunch of useful information like MAC addresses, serial numbers, disk devices, etc., suitable for post-processing/transmission to a system of record.
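
For example, assuming the facts column holds JSON text (it appears to, per the output above) and you're on a Postgres with the JSON operators (9.3+), you can pull individual facts straight out of the database:

psql -c "select name, facts::json->>'macaddress' as mac from nodes" razor_prd postgres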

And we'll leave it at that for now. Next time we'll pick up with the second objective, doing some automated OS installs.

Tuesday, June 11, 2019

Bare Metal Management With Razor (1/N)

Last episode, I spent a little time messing around with Foreman, and eventually came to the conclusion that it's not quite the tool that I was looking for. Foreman wants a lot of fairly involved configuration up front, and (based on my limited experimentation) wants you to have a good idea what you're going to do with hardware ahead of time. Many other candidate systems (see my list) seem to operate under a similar paradigm. What I really want is something that will let me painlessly boot up machines and do basic hardware work (inventory/diagnostics/configuration) before making any decisions about if/how to image them.

One tool which stands out from the crowd in this regard is Razor. It provides a microkernel and some interesting PXE capabilities which let you get things up and running while deferring decisions about imaging to a later date. So it seems like a good candidate to experiment with further.

Start by building the same base VM we used for Foreman, with the exception that it only needs 1G of RAM.

Razor makes use of Postgres for data persistence, so we'll need to get that up and running as well. Here are some instructions for CentOS 7:

yum install -y postgresql-server postgresql-contrib
postgresql-setup initdb
systemctl start postgresql
And then the Razor-specific setup:
su - postgres
createuser -P razor
createdb -O razor razor_prd
The snippet above creates a user named razor and a DB named razor_prd owned by this user. This concludes the basic configuration of the Postgres DB; schema creation will follow in a bit.

Moving on, we need to install the Razor server itself. Again, here's a distillation of the official instructions:

yum install -y http://yum.puppetlabs.com/puppetlabs-release-pc1-el-7.noarch.rpm
yum install -y razor-server
So far, so good. Next we need to set up the DB schema, using the tools provided by the Razor package:
su - razor -c 'razor-admin -e production migrate-database'
The first time I did this I got
Sequel::DatabaseConnectionError: Java::OrgPostgresqlUtil::PSQLException: FATAL: Ident authentication failed for user "razor"
which indicates that something is wrong with the auth configuration for the Postgres DB. After a little Googling I found this post, which provided a fix. If you get the above error, open pg_hba.conf, wherever it may reside, and change the line which reads
host    all             all             127.0.0.1/32           ident
to
host    all             all             127.0.0.1/32           trust
The observant reader will note that we never set a password for the razor DB user. By default Razor expects to just be able to access the DB without a password, so the above change accommodates this requirement by making the DB trust any connection originating from localhost.

Alrighty, we should be all set. Fire up the server:

service razor-server start
Disable the firewall:
systemctl disable firewalld
iptables -F
And, on your host system, add a port forwarding rule to reach the Razor web interface:
VBoxManage natnetwork modify --netname natnet1 --port-forward-4 'razor:tcp:[127.0.0.1]:8150:[192.168.15.254]:8150'

Now, if you navigate to http://127.0.0.1:8150/api, you should get back a bunch of JSON showing the available server commands. This tells you that the Razor server is up and running and talking to the Postgres DB. This concludes installation of the Razor server proper, but there's still some work to be done to get the PXE infrastructure deployed.
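
As a quick sanity check from inside the VM, the same API can be hit with curl:

curl -s http://localhost:8150/api | python -m json.tool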

First, a handful of bootstrappy things need to be put in various locations; don't think too hard about this part unless you really, really want to know the gory details. Grab the latest microkernel and put it in the appropriate location:

yum install -y wget
wget http://pup.pt/razor-microkernel-latest
tar -C /opt/puppetlabs/server/data/razor-server/repo -xf razor-microkernel-latest
Ditto the PXE script for UNDI systems:
wget -O /var/lib/tftpboot/undionly.kpxe http://boot.ipxe.org/undionly.kpxe
Add a line to /etc/hosts which will allow Razor to generate a bootstrap script:
192.168.15.254 razor.localdomain razor
and then call the Razor API to generate it:
wget -O /var/lib/tftpboot/bootstrap.ipxe http://razor.localdomain:8150/api/microkernel/bootstrap
This concludes the mindless copying of the aforementioned bootstrappy things... back to the interesting bits.

Now here's a bit of a complication that we haven't had to deal with before. 'razor.localdomain' gets embedded into the bootstrap script, which means it needs to be resolvable by client systems. Usually, when experimenting, you can hack around this by adding appropriate entries to /etc/hosts, but since there's no equivalent to /etc/hosts in the PXE environment that won't work. Instead, razor.localdomain will have to be genuinely resolvable via DNS, which means we have to stand up some sort of DNS server.

I don't want to set up BIND, or any of the other enterprise-grade servers, for something as simple as providing DNS service for a single subnet. The PXE/DHCP/TFTP setup docs for Razor provide info on configuring Dnsmasq which, incidentally, can also be used for DNS:

Dnsmasq is a lightweight, easy to configure DNS forwarder, designed to provide DNS (and optionally DHCP and TFTP) services to a small-scale network. It can serve the names of local machines which are not in the global DNS.

Dnsmasq is an example of super awesome design. It has a bunch of really smart defaults, like reading records from /etc/hosts and setting up server forwarding on the basis of /etc/resolv.conf. It basically just Does The Right Thing™. So let's get Dnsmasq installed and configured:

yum install -y dnsmasq
Create a file /etc/dnsmasq.conf and paste in the configuration from the Razor docs:
dhcp-match=IPXEBOOT,175
dhcp-boot=net:IPXEBOOT,bootstrap.ipxe
dhcp-boot=undionly.kpxe
# TFTP setup
enable-tftp
tftp-root=/var/lib/tftpboot
We also need to specify the network configuration for DHCP:
dhcp-range=enp0s3,192.168.15.2,192.168.15.253,4h
dhcp-option=3,192.168.15.1
The first line says that requests received from interface enp0s3 will get IPs in the range of 192.168.15.2 - 192.168.15.253 with a lease time of 4 hours. The second line sets the default gateway to 192.168.15.1.
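
As an aside, dnsmasq can also pin a given MAC to a fixed address, which is one crude way to keep a node's IP stable once it's been imaged (the MAC and IP here are just examples):

dhcp-host=08:00:27:0c:fd:f4,192.168.15.50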

That concludes configuration of Dnsmasq. Once that's done, start it up:

service dnsmasq start

Ok, how are we looking?

[root@razor ~]# dig @127.0.0.1 razor.localdomain | grep -v '^;'

razor.localdomain.	0	IN	A	192.168.15.254
Bueno!

Alright, to recap what we did, since this was more involved than usual:

  1. Set up Postgres, and create a DB and user for use by Razor.
  2. Install Razor, and then use the provided utilities to set up the DB schema.
  3. Put the materials in place to support PXE boot.
  4. Install and configure Dnsmasq, which will provide PXE/DHCP, TFTP, and DNS services for our tiny little subnet.

Next time we'll use this collection of infrastructure to PXE boot a couple of VMs.
