ansible-lint にPull Request送ってみた

はじめに
ansible-lint と私
- (余談) good first issue の活用
調査の一部始終
プルリク送ってみた
- (余談) テスト駆動開発
おわりに
References

はじめに

こんにちは、最近パンを焼いている者です。

これがこう pic.twitter.com/4y7mJYHmwV

— nishipy (@iamnishipy) October 16, 2022

これは「Ansible Advent Calendar 2022」 7日目の記事です。Advent Calendarがあるとブログ書く気になるので、ありがたいイベントです。

Kubernetes Advent Calendarにも参加したので、よければご覧ください〜。

kubeletのPLEGをソースコード読んで理解したかった – Kubernetes Advent Calendar 6日目

ansible-lint と私

ansible-lintは、その名の通りAnsibleのlinterです。PlaybookやRoleがコード規約に則って書けているかチェックできます。コード規約は、rules として独自に定義することもできますし、一般的なものはデフォルトルールとしてあらかじめ定義されています。

某社が提供するRed Hat Ansible Automation Platform (AAP)でも、AAP 2.2からansible-lintがTech Previewとして含まれるようになりました。

What’s new in Ansible Automation Platform 2.2 | Red Hat

2022年は私にとってコードを書くタイプのOSSコントリビュートを頑張った1年でした。そのうちansible-lintのバグを修正したことについて共有させてください。なお、記事の中で紹介するバグ修正がansible-lint v6.1.0で取り込まれたため、載せるソースコードはその直前の ansible-lint v6.0.2 時点のものです。ご注意ください。

私がansible-lintで初めて修正したバグで、かつ一番大きかったのがこちらです。デフォルトルールの var-naming のバグです。変数名がコード規約に従っているかチェックするルールですが、block 内で宣言した変数に対してうまく動いませんでした。根拠なく、なんかいけそうな気がしたので、こちらに取り組むことにしました。

Variable naming standards different for tasks and blocks · Issue #1895 · ansible/ansible-lint

Summary Using uppercase variables in tasks are reported as `var-naming: Task defines variables within 'vars' section that violates variable naming standards´. T...(続く)

(余談) good first issue の活用

コントリビュートしてみたいけど何から始めればいいかわからない場合、good first issue を活用するのもよさそうです。Issueのラベルではなくそういう名前のWebサイトです。

Good First Issue: Make your first open-source contribution

Making your first open-source contribution is easier than you think. Good First Issue is a curated list of issues from popular open-source projects that you can...(続く)

人気のあるプロジェクトや新たなコントリビューターを歓迎しているプロジェクトのIssueから、good first issueラベルがついた比較的取り組みやすいIssueをまとめてくれています。言語ごとにフィルターもできて便利です。

調査の一部始終

ansiblelint._internal.rules.BaseRule.getmatches()

anbsible-lintでルール違反を検出するのが、BaseRule クラスのgetmatches()メソッドです。

# ~/src/ansiblelint/_internal/rules.py

def getmatches(self, file: "Lintable") -> List["MatchError"]:

"""Return all matches while ignoring exceptions."""

matches = []

if not file.path.is_dir():

for method in [self.matchlines, self.matchtasks, self.matchyaml]:

try:

matches.extend(method(file))

except Exception as exc: # pylint: disable=broad-except

_logger.debug(

"Ignored exception from %s.%s: %s",

self.__class__.__name__,

method,

exc,

)

else:

matches.extend(self.matchdir(file))

return matches

for method in [self.matchlines, self.matchtasks, self.matchyaml] のところで、すべての評価用のメソッドを順に読んでいき、最後にまとめて結果を返しています。

ansiblelint.rules.match_types

以下のようなメソッドが評価に使われます。match, matchtask, matchplay はそれぞれmatchlines、matchtasks、matchyamlから呼び出される構造のようです。今回取り組んだIssueは、tasks (およびpre_tasks/post_tasks)の中のblock内の変数名をうまく評価できていないという問題なので、matchtasks/matchtaskを見るのが良さそうです。

# ~/src/ansiblelint/rules/__init__.py

match_types = {

"matchlines": "line",

"match": "line", # called by matchlines

"matchtasks": "task",

"matchtask": "task", # called by matchtasks

"matchyaml": "yaml",

"matchplay": "play", # called by matchyaml

"matchdir": "dir",

}

AnsibleLintRule.matchtasks()

個々のルールは、このAnsibleLintRule classをオーバーライドしており、matchtaskだけを独自に実装しています。そのため、呼び出し元のmatchtasksは共通と思って良さそうです。

# ~/src/ansiblelint/rules/__init__.py

class AnsibleLintRule(BaseRule):

"""AnsibleLintRule should be used as base for writing new rules."""

# ~~

def matchtasks(self, file: Lintable) -> List[MatchError]:

matches: List[MatchError] = []

if (

not self.matchtask

or file.kind not in ["handlers", "tasks", "playbook"]

or str(file.base_kind) != "text/yaml"

return matches

tasks_iterator = ansiblelint.yaml_utils.iter_tasks_in_file(file, self.id)

for raw_task, task, skipped, error in tasks_iterator:

if error is not None:

# normalize_task converts AnsibleParserError to MatchError

return [error]

if skipped or "action" not in task:

continue

if self.needs_raw_task:

task["__raw_task__"] = raw_task

result = self.matchtask(task, file=file)

if not result:

continue

message = None

if isinstance(result, str):

message = result

task_msg = "Task/Handler: " + ansiblelint.utils.task_to_str(task)

match = self.create_matcherror(

message=message,

linenumber=task[ansiblelint.utils.LINE_NUMBER_KEY],

details=task_msg,

filename=file,

)

match.task = task

matches.append(match)

return matches

ansiblelint.rules.VariableNamingRule

var-namingルールのバグということなので、src/ansiblelint/rules/var_naming.py の matchtask() を見てみます。

# ~/src/ansiblelint/rules/var_naming.py

class VariableNamingRule(AnsibleLintRule):

"""All variables should be named using only lowercase and underscores."""

id = "var-naming"

severity = "MEDIUM"

tags = ["idiom", "experimental"]

version_added = "v5.0.10"

re_pattern = re.compile(options.var_naming_pattern or "^[a-z_][a-z0-9_]*$")

# ~~

def matchtask(

self, task: Dict[str, Any], file: Optional[Lintable] = None

) -> Union[bool, str]:

"""Return matches for task based variables."""

# If the task uses the 'vars' section to set variables

our_vars = task.get("vars", {})

for key in our_vars.keys():

if self.is_invalid_variable_name(key):

return "Task defines variables within 'vars' section that violates variable naming standards"

# If the task uses the 'set_fact' module

ansible_module = task["action"]["__ansible_module__"]

ansible_action = task["action"]

if ansible_module == "set_fact":

for key in ansible_action.keys():

if self.is_invalid_variable_name(key):

return "Task uses 'set_fact' to define variables that violates variable naming standards"

# If the task registers a variable

registered_var = task.get("register", None)

if registered_var and self.is_invalid_variable_name(registered_var):

return "Task registers a variable that violates variable naming standards"

return False

task というデータをインプットとして使い、our_vars = task.get("vars", {}) のところで変数を抜き出してきています。

いろいろデバッグしていると、そもそもmatchtaskのインプットとなるデータから、blockの情報が欠落しているというのがわかりました。従って、インプットのデータを作るところ再度確認するため、呼び出し元のAnsibleLintRule.matchtasks()メソッドに話を戻します。

再び、AnsibleLintRule.matchtasks()

# ~/src/ansiblelint/rules/__init__.py

class AnsibleLintRule(BaseRule):

"""AnsibleLintRule should be used as base for writing new rules."""

# ~~

def matchtasks(self, file: Lintable) -> List[MatchError]:

matches: List[MatchError] = []

if (

not self.matchtask

or file.kind not in ["handlers", "tasks", "playbook"]

or str(file.base_kind) != "text/yaml"

return matches

tasks_iterator = ansiblelint.yaml_utils.iter_tasks_in_file(file, self.id)

for raw_task, task, skipped, error in tasks_iterator:

if error is not None:

# normalize_task converts AnsibleParserError to MatchError

return [error]

if skipped or "action" not in task:

continue

if self.needs_raw_task:

task["__raw_task__"] = raw_task

result = self.matchtask(task, file=file)

さて、もう一度matchtasksを見てみると、result = self.matchtask(task, file=file) のところで、matchtaskを呼んでいます。このとき引数となっているtask の作り方が怪しそうです。

matchtaskはfor文の中で呼ばれていますが、そのイテレータは tasks_iterator = ansiblelint.yaml_utils.iter_tasks_in_file(file, self.id) で生成されているのがわかります。怪しい。

ansiblelint.yaml_utils.iter_tasks_in_file()

# ~/src/ansiblelint/yaml_utils.py

def iter_tasks_in_file(

lintable: Lintable, rule_id: str

) -> Iterator[Tuple[Dict[str, Any], Dict[str, Any], bool, Optional[MatchError]]]:

"""Iterate over tasks in file.

This yields a 4-tuple of raw_task, normalized_task, skipped, and error.

raw_task:

When looping through the tasks in the file, each "raw_task" is minimally

processed to include these special keys: __line__, __file__, skipped_rules.

normalized_task:

When each raw_task is "normalized", action shorthand (strings) get parsed

by ansible into python objects and the action key gets normalized. If the task

should be skipped (skipped is True) or normalizing it fails (error is not None)

then this is just the raw_task instead of a normalized copy.

skipped:

Whether or not the task should be skipped according to tags or skipped_rules.

error:

This is normally None. It will be a MatchError when the raw_task cannot be

normalized due to an AnsibleParserError.

:param lintable: The playbook or tasks/handlers yaml file to get tasks from

:param rule_id: The current rule id to allow calculating skipped.

:return: raw_task, normalized_task, skipped, error

"""

data = parse_yaml_linenumbers(lintable)

if not data:

return

data = ansiblelint.skip_utils.append_skipped_rules(data, lintable)

raw_tasks = get_action_tasks(data, lintable)

for raw_task in raw_tasks:

err: Optional[MatchError] = None

# An empty `tags` block causes `None` to be returned if

# the `or []` is not present - `task.get('tags', [])`

# does not suffice.

skipped_in_task_tag = "skip_ansible_lint" in (raw_task.get("tags") or [])

skipped_in_yaml_comment = rule_id in raw_task.get("skipped_rules", ())

skipped = skipped_in_task_tag or skipped_in_yaml_comment

if skipped:

yield raw_task, raw_task, skipped, err

continue

try:

normalized_task = normalize_task(raw_task, str(lintable.path))

except MatchError as err:

# normalize_task converts AnsibleParserError to MatchError

yield raw_task, raw_task, skipped, err

return

yield raw_task, normalized_task, skipped, err

iter_tasks_in_file() は、yaml_utilの中にある関数ですが、YAMLファイル内のtaskをイテレートします。具体的にはraw_task, normalized_task, skipped, error のタプルを生成するようです。

サンプルのPlaybookを作って地道にデバッグしてみると、raw_tasks = get_action_tasks(data, lintable) でdataをtaskの形に変換する際に、以下のようにblockの情報が捨てられているのがわかりました。

---

- hosts: 'localhost'

gather_facts: true

tasks:

- name: 'test in a block'

block:

- name: set fact

ansible.builtin.set_fact:

foo: "{{ BAD }}"

vars:

BAD: false

DATA: [{'hosts': 'localhost', 'gather_facts': True, 'tasks': [{'name': 'test in a block', 'block': [{'name': 'set fact', 'ansible.builtin.set_fact': {'foo': '{{ BAD }}', '__line__': 9, '__file__': PosixPath('../test/ansible-lint-test/test.yaml')}, '__line__': 7, '__file__': PosixPath('../test/ansible-lint-test/test.yaml'), 'skipped_rules': []}], 'vars': {'BAD': False, '__line__': 11, '__file__': PosixPath('../test/ansible-lint-test/test.yaml')}, '__line__': 5, '__file__': PosixPath('../test/ansible-lint-test/test.yaml'), 'skipped_rules': [], '__ansible_action_type__': 'task'}], '__line__': 2, '__file__': PosixPath('../test/ansible-lint-test/test.yaml')}]

RAWTASKS: [{'name': 'set fact', 'ansible.builtin.set_fact': {'foo': '{{ BAD }}', '__line__': 9, '__file__': PosixPath('../test/ansible-lint-test/test.yaml')}, '__line__': 7, '__file__': PosixPath('../test/ansible-lint-test/test.yaml'), 'skipped_rules': [], '__ansible_action_type__': 'meta'}]

ansiblelint.utils.get_action_tasks()

# ~/src/ansiblelint/utils.py

def get_action_tasks(data: AnsibleBaseYAMLObject, file: Lintable) -> List[Any]:

"""Get a flattened list of action tasks from the file."""

tasks = []

if file.kind in ["tasks", "handlers"]:

tasks = add_action_type(data, file.kind)

else:

tasks.extend(extract_from_list(data, PLAYBOOK_TASK_KEYWORDS))

# Add sub-elements of block/rescue/always to tasks list

tasks.extend(extract_from_list(tasks, NESTED_TASK_KEYS, recursive=True))

# Remove block/rescue/always elements from tasks list

tasks[:] = [task for task in tasks if all(k not in task for k in NESTED_TASK_KEYS)]

# Include the FQCN task names as this happens before normalize

return [

task

for task in tasks

if set(

[

"include",

"include_tasks",

"import_playbook",

"import_tasks",

"ansible.builtin.include",

"ansible.builtin.include_tasks",

"ansible.builtin.import_playbook",

"ansible.builtin.import_tasks",

]

).isdisjoint(task.keys())

]

ちゃんとコメントで # Remove block/rescue/always elements from tasks list と書いてましたw。経緯はわからないですが、直前の# Add sub-elements of block/rescue/always to tasks listでblock/rescue/always内のネストされたタスクを取り出してフラットにしているので、不要なblock/rescue/alwaysの要素を消しているんでしょうか。このときにblock/rescue/alwaysレベルで定義した変数は消えてしまっているので、var-namingルールでうまく扱えなかったようです。

プルリク送ってみた

前章のような調査以外にもいろいろ試しつつ、なんとか修正してPull Requestを送りました。最終的な変更内容はこちらです。別のルールのバグもついでに修正したり、テストコード追加したり、いろいろできました。普段コード書くことそんなに多くないのでとても楽しかったです(小並感)。最初は見当違いな実装をしていた上テストを書いておらず、今考えれば散々な内容でしたが、レビュアーの方が親切で助かりました。

Make sure all tasks get evaluated by matchtask including block/always/rescue and nested tasks by nishipy · Pull Request #2031 · ansible/ansible-lint

Fix #1895.

(余談) テスト駆動開発

今回の修正を行なったおかげで、他のルールの実装をリファクタリングすることもできました。リファクタリング前に不足していたテストを書いたり、リファクタリング後に漏れていたテストケースを追加したりしました。
リファクタリングのPull Requestもなんとかマージしてもらい、テストコードって大切だなあと感じたので、後日こちらの書籍を読みました。そして、これ理解してなくてごめんなさいという気持ちになりました。

おわりに

何度も心が折れそうになりましたが、無事マージしてもらえてよかったです。普段使っているツールに自分で書いたコードが含まれているってなんかロマンありますよね。来年も何かしらコミュニティに貢献できるよう頑張ります。

はじめに

ansible-lint と 私

(余談) good first issue の活用

調査の一部始終

ansiblelint._internal.rules.BaseRule.getmatches()

ansiblelint.rules.match_types

AnsibleLintRule.matchtasks()

ansiblelint.rules.VariableNamingRule

再び、AnsibleLintRule.matchtasks()

ansiblelint.yaml_utils.iter_tasks_in_file()

ansiblelint.utils.get_action_tasks()

プルリク送ってみた

(余談) テスト駆動開発

おわりに

References

コメント

ansible-lint と私