Pillar does not handle Unicode data #3436

Closed
madduck opened this issue Jan 25, 2013 · 10 comments
Labels: Bug, P1, Pillar, Platform, severity-medium

Comments

@madduck
Contributor

madduck commented Jan 25, 2013

My pillar data is supplied by cmd_yaml, and some of it contains non-ASCII characters (UTF-8 encoded). Using pdb and printf-style debugging, I have verified that the ext_pillar function in salt/pillar/cmd_yaml.py returns those values correctly as unicode objects.

I want to use these data in a Jinja2 template. Unfortunately (again verified with pdb and printf-style debugging), the data passed as "Context" to the Jinja2 processor are no longer unicode objects.

This means that somewhere within Salt, between ext_pillar and instantiating a Jinja2 template as part of file.managed, the data are converted from unicode objects to byte strings, without a conversion error. And indeed, the character '…' is stored as "\xe2\x80\xa6" (its UTF-8 byte sequence), not as "\u2026" as it should be.
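To make the two representations concrete, here is a minimal sketch. It assumes Python 3, where the text and byte types are `str` and `bytes`; the issue itself concerns Python 2, where the corresponding types are `unicode` and `str`. The variable names are illustrative only.

```python
# The same character as a text string and as UTF-8 encoded bytes.
ellipsis_text = "\u2026"                        # '…' as text (one code point)
ellipsis_bytes = ellipsis_text.encode("utf-8")  # the bytes that end up in the template context

print(repr(ellipsis_bytes))                     # b'\xe2\x80\xa6'
print(len(ellipsis_text), len(ellipsis_bytes))  # 1 3

# Decoding with the matching codec recovers the original text.
recovered = ellipsis_bytes.decode("utf-8")
print(recovered == ellipsis_text)               # True
```

The byte string is not lossy; it only needs to be decoded as UTF-8 before it is used as text.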

@tempusfrangit
Contributor

This appears to be related to msgpack and how it handles unicode objects. There is an active thread on GitHub: msgpack/msgpack#121

Since the data is msgpacked on the master (dumps) before being sent to the minions and then unpacked (loads), the unicode data type is lost and we get a byte string back. I am not sure what the best approach to fixing this problem is.

I think the wrong approach would be decoding everything as UTF-8 (sub-optimal, and it would require walking the data structure looking for any strings to decode).

@thatch45 do you have any insight on the best approach? Maybe build a map that tracks any unicode items in pillar (and possibly elsewhere) and then decode just those on the minion's side. The downside is that this (again) requires running through all the data to find the unicode values and build the mapping.

Confirmed via python cli here:

>>> import msgpack
>>> before_msgpack = {'INT': 1, 'str': 'STRING', 'UNICODE': u'\u2026'}
>>> before_msgpack
{'INT': 1, 'str': 'STRING', 'UNICODE': u'\u2026'}
>>> packed = msgpack.dumps(before_msgpack)
>>> after_msgpack = msgpack.loads(packed)
>>> after_msgpack
{'INT': 1, 'str': 'STRING', 'UNICODE': '\xe2\x80\xa6'}
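For reference, the "run through the data structure and decode" approach mentioned above could look roughly like the sketch below. It assumes Python 3 (`bytes` and `str`), and `decode_strings` is a hypothetical helper written for illustration; it is not part of Salt or msgpack.

```python
def decode_strings(data, encoding="utf-8"):
    """Return a copy of `data` with every bytes value decoded to str.

    Hypothetical helper for illustration; not part of Salt or msgpack.
    """
    if isinstance(data, bytes):
        return data.decode(encoding)
    if isinstance(data, dict):
        # Decode both keys and values, recursing into nested containers.
        return {decode_strings(key, encoding): decode_strings(value, encoding)
                for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        # Rebuild the same container type from the decoded items.
        return type(data)(decode_strings(item, encoding) for item in data)
    # Integers, None, already-decoded text, etc. pass through unchanged.
    return data


# Shape of the data after a round trip that lost the text type.
after_roundtrip = {b"INT": 1, b"str": b"STRING", b"UNICODE": b"\xe2\x80\xa6"}
restored = decode_strings(after_roundtrip)
print(restored["UNICODE"] == "\u2026")  # True
```

The cost noted in the comment is visible here: every element of the structure is visited on each call, whether or not it needs decoding.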

@thatch45
Contributor

I am less than excited about this one. I will contact the msgpack guys...

@torhve
Contributor

torhve commented Feb 28, 2013

Would one solution to this problem not be to force everything in Salt to be UTF-8?

This is how it works now:

>>> msgpack.loads(msgpack.dumps([1, u'Ø', 'ascii']))
(1, '\xc3\x98', 'ascii')

With forcing, the data would look like this:

>>> msgpack.loads(msgpack.dumps([1, u'Ø', 'ascii']), encoding='UTF-8')
(1, u'\xd8', u'ascii')

@torhve
Contributor

torhve commented Mar 7, 2013

Ran into a problem again today with this bug.
@thatch45 is this on your radar?

@sebw
Contributor

sebw commented Jul 1, 2013

I'm managing my DNS authoritative servers through Salt.

Since July 11, .be domain names can contain accented characters such as é, à, è, etc.

The pillar:

dns-public:
  master:
    ns01
  master-ip:
    "x.x.x.x;"
  slave-ip:
    "x.x.x.x;"
  allow-transfer:
    "x.x.x.x;"
  domain:
    - "\u00e9xample.org"

The state:

{% if pillar['dns-public']['master'] == grains['nodename'] %}
{% for domain in pillar['dns-public']['domain'] %}
/var/named/data/{{ domain }}.hosts:
  file:
    - managed
    - source: salt://dns-public/template.hosts
    - user: named
    - group: named
    - mode: 0664
    - template: jinja
    - backup: minion
    - replace: False
    - context:
      domain: {{ domain }}
      serial: {{ 2010123101 }}
    - require:
      - file: /var/named/data
{% endfor %}
{% endif %}

The source:

$ORIGIN .
$TTL 600
{{ domain }}      IN      SOA     ns01.x. dnsadmin.x. (
                        {{ serial }}
                        10800
                        3600
                        604800
                        86400 )
                        NS ns01.x.
                        NS ns02.x.
                        A x.x.x.x
www.{{ domain }}. IN CNAME {{ domain }}.

salt 'ns*' pillar.data

dns-public:
        ----------
        allow-transfer:
            x.x.x.x;
        domain:
            - éxample.org

salt 'ns*' state.highstate:

   State: - file
    Name:      /etc/named/domain.conf
    Function:  managed
        Result:    False
        Comment:   Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/salt/utils/templates.py", line 55, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python2.6/site-packages/salt/utils/templates.py", line 98, in render_jinja_tmpl
    output = jinja_env.from_string(tmplstr).render(**context)
  File "/usr/lib64/python2.6/site-packages/jinja2/environment.py", line 669, in render
    return self.environment.handle_exception(exc_info, True)
  File "<template>", line 11, in top-level template code
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
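The error in this traceback is what happens when a UTF-8 encoded byte string is decoded with the ASCII codec. In Python 2 that decode happens implicitly whenever a byte string is combined with a unicode string, and Jinja2 works with unicode internally. The sketch below triggers the same failure explicitly; it assumes Python 3, and the variable names are illustrative only.

```python
# The pillar value "\u00e9xample.org" as it arrives after losing its text type:
# UTF-8 encoded bytes, starting with 0xc3 0xa9 for 'é'.
domain_bytes = "\u00e9xample.org".encode("utf-8")

try:
    # Roughly what Python 2 does implicitly when mixing byte and unicode strings.
    domain_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Decoding with the codec the bytes were actually written in succeeds.
domain_text = domain_bytes.decode("utf-8")
print(domain_text == "\u00e9xample.org")  # True
```

Byte 0xc3 at position 0 is the first byte of the UTF-8 encoding of 'é', which matches the position reported in the traceback.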

thatch45 added a commit that referenced this issue Jul 2, 2013
Use an unicode aware context in the jinja render call. Refs #3436
@rallytime added the Bug and severity-low labels on Oct 1, 2014
@alexmorozov

Maybe I'm posting in the wrong thread, but this issue comes up whenever I google for pillar and unicode problems. Anyway.
The current workaround (as of salt 2014.01.13) is to use the yaml_utf8 setting in the master config and to force unicode conversion of variables, like {{ pillar.unicode.variable.decode('utf-8') }}. At least it works for us.
Hope this helps someone.

@basepi
Copy link
Contributor

basepi commented Oct 31, 2014

Awesome, thanks for the workaround, @alexmorozov!

@jfindlay added the severity-medium, P1, Platform, and Pillar labels and removed the severity-low label on Jun 10, 2015
@ghost

ghost commented Oct 10, 2015

So, does the {{ pillar.unicode.variable.decode('utf-8') }} workaround work, @basepi? How do I use {{ pillar.unicode.variable.decode('utf-8') }} in sls files? Can you show me an example configuration? Thanks a lot, @alexmorozov.

bernieke added a commit to Awingu/salt that referenced this issue Oct 20, 2015
@bernieke
Contributor

I've created a pull request fixing this without the need for yaml_utf8 or decode (on top of 2015.8.1 and whatever fixes that already carries).

cachedout pushed a commit that referenced this issue Oct 20, 2015
fix unicode pillar values #3436
@basepi
Contributor

basepi commented Oct 20, 2015

Awesome @bernieke! That fix has been merged, so I'm going to close this. It will be in 2015.8.2.

10 participants