# Ironic Node Error State

**Summary**: an Ironic node has entered the `error` [provision state](https://docs.openstack.org/ironic/latest/contributor/states.html). Per the docs:

> This is the state a node will move into when deleting an active deployment fails.

**Consequences**: users will not be able to launch instances on these Ironic nodes. However, they will still be able to reserve the nodes, which can cause confusion when they try to use the reservation.

#### Possible causes

**Temporary IPMI connectivity disruption**: In some cases, the power status of the node cannot be synced during a deployment or undeployment, and the node enters an error state as a precaution. This happens periodically, simply due to network contention or interruption on the provisioning network, so there is a [hammer](https://github.com/ChameleonCloud/hammers/blob/master/hammers/scripts/ironic_error_resetter.py) that attempts to "reset" this state automatically.

1. Check the "extra" field on the node: `openstack baremetal node show $node -f json | jq .extra`. A node that has been reset by the hammer will have a "hammer\_error\_resets" key with timestamps for each time a reset was performed.
2. If the key lists more than [`max_attempts`](https://github.com/ChameleonCloud/hammers/blob/master/hammers/scripts/ironic_error_resetter.py#L146) resets (3 at time of writing), then this node could have an issue with its IPMI interface and should be put into maintenance.
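
The two checks above can be sketched as a pair of shell helpers. This is a minimal sketch, not part of the hammers tooling: the function names `count_hammer_resets` and `exceeds_max_attempts` are hypothetical, and it assumes `hammer_error_resets` is stored as a JSON array of timestamps. The default limit of 3 mirrors the `max_attempts` value noted above and may drift from the hammer's current setting.

```shell
# Hypothetical helper: count the hammer resets recorded on a node.
# Reads the output of `openstack baremetal node show $node -f json` on stdin.
# Assumes extra.hammer_error_resets is a JSON array of timestamps;
# prints 0 when the key is absent.
count_hammer_resets() {
  jq '(.extra.hammer_error_resets // []) | length'
}

# Hypothetical helper: succeed (exit 0) when the reset count exceeds the limit.
# The limit defaults to 3, the max_attempts value at time of writing.
exceeds_max_attempts() {
  local count="$1" max_attempts="${2:-3}"
  [ "$count" -gt "$max_attempts" ]
}

# Usage (requires OpenStack credentials in the environment):
#   resets=$(openstack baremetal node show "$node" -f json | count_hammer_resets)
#   if exceeds_max_attempts "$resets"; then
#     openstack baremetal node maintenance set --reason "Repeated IPMI error resets" "$node"
#   fi
```

If the key turns out to be stored in a different shape (for example, a serialized string), the `jq` filter would need adjusting before the count can be trusted.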

**Temporary API connectivity disruption**: Many OpenStack services are involved in instance tear-down (e.g., Keystone, Nova, Ironic, Neutron). If any of those cannot be reached, the instance can fail to tear down.

**IPMI interface failure**: If the node has a pattern of issues with IPMI, there could be an issue with the BMC, the IPMI NIC, or even the physical cable or connection on the switch that provides IPMI connectivity. All of these issues require maintenance of the node.

#### Clearing the error state

To put the node back into the `available` state, you can trigger an `undeploy` of the node. This works even if the node doesn't have an instance: it essentially deletes the instance if there is one, cleans the node, and then resets the state.

```shell
openstack baremetal node undeploy $node
```
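
After triggering the undeploy, it can help to confirm that the node actually reaches `available`. The following is a minimal sketch of a polling loop; `get_provision_state` and `wait_for_available` are hypothetical helper names, and the defaults of 30 checks at 10-second intervals are arbitrary assumptions to tune for how long cleaning takes at your site.

```shell
# Hypothetical helper: print the node's current provision state.
get_provision_state() {
  openstack baremetal node show "$1" -f value -c provision_state
}

# Hypothetical helper: poll until the node reaches `available`.
# Arguments: node, number of checks (default 30), delay in seconds (default 10).
# Returns non-zero if cleaning fails or the node never becomes available.
wait_for_available() {
  local node="$1" attempts="${2:-30}" delay="${3:-10}" state="" i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$(get_provision_state "$node")
    case "$state" in
      available)
        echo "$node is available"
        return 0 ;;
      "clean failed")
        echo "$node entered '$state'" >&2
        return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$node still in '$state' after $attempts checks" >&2
  return 1
}

# Usage (requires OpenStack credentials in the environment):
#   openstack baremetal node undeploy "$node" && wait_for_available "$node"
```

A node that falls back into `error` shows up here as a timeout with the last observed state printed, which is a cue to revisit the possible causes above.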

