
Commit

Merge pull request #66 from tableau/development
Releasing 0.2 to master
Russell Hay authored Jul 26, 2016
2 parents e6a0bba + aa93eef commit 59ac21c
Showing 32 changed files with 1,407 additions and 90 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -63,3 +63,4 @@ target/

#Other things
.DS_Store
.idea
5 changes: 3 additions & 2 deletions .travis.yml
@@ -12,10 +12,11 @@ install:
# command to run tests
script:
# Tests
- python test.py
- python setup.py test
# pep8
- pep8 --ignore=E501 .
- pep8 .
# Examples
- (cd "Examples/Replicate Workbook" && python replicateWorkbook.py)
- (cd "Examples/List TDS Info" && python listTDSInfo.py)
- (cd "Examples/GetFields" && python show_fields.py)

11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,11 @@
## 0.2 (22 July 2016)

* Added support for loading twbx and tdsx files (#43, #44)
* Added Fields property to datasource (#45)
* Added Example for using the Fields Property (#51)
* Added Ability to get fields used by a specific sheet (#54)
* Code clean up and test reorganization

## 0.1 (29 June 2016)

* Initial Release to the world
1 change: 1 addition & 0 deletions Examples/GetFields/World.tds
29 changes: 29 additions & 0 deletions Examples/GetFields/show_fields.py
@@ -0,0 +1,29 @@
############################################################
# Step 1) Use Datasource object from the Document API
############################################################
from tableaudocumentapi import Datasource

############################################################
# Step 2) Open the .tds we want to inspect
############################################################
sourceTDS = Datasource.from_file('World.tds')

############################################################
# Step 3) Print out all of the fields and what type they are
############################################################
print('----------------------------------------------------------')
print('--- {} total fields in this datasource'.format(len(sourceTDS.fields)))
print('----------------------------------------------------------')
for count, field in enumerate(sourceTDS.fields.values()):
    print('{:>4}: {} is a {}'.format(count+1, field.name, field.datatype))
    blank_line = False
    if field.calculation:
        print(' the formula is {}'.format(field.calculation))
        blank_line = True
    if field.default_aggregation:
        print(' the default aggregation is {}'.format(field.default_aggregation))
        blank_line = True

    if blank_line:
        print('')
print('----------------------------------------------------------')
52 changes: 37 additions & 15 deletions README.md
@@ -6,15 +6,26 @@ This repo contains Python source and example files for the Tableau Document API.

Document API
---------------
The Document API provides a supported way to programmatically make updates to Tableau workbook (`.twb`) and datasource (`.tds`) files. If you've been making changes to these file types by directly updating the XML--that is, by XML hacking--this SDK is for you :)

Currently only the following operations are supported:

- Modify database server
- Modify database name
- Modify database user

We don't yet support creating files from scratch. In addition, support for `.twbx` and `.tdsx` files is coming.
The Document API provides a supported way to programmatically make updates to Tableau workbook and data source files. If you've been making changes to these file types by directly updating the XML--that is, by XML hacking--this SDK is for you :)

Features include:
- Support for 9.X and 10.X workbook and data source files
  - Including TDSX and TWBX files
- Getting connection information from data sources and workbooks
  - Server Name
  - Username
  - Database Name
  - Authentication Type
  - Connection Type
- Updating connection information in workbooks and data sources
  - Server Name
  - Username
  - Database Name
- Getting Field information from data sources and workbooks
  - Get all fields in a data source
  - Get all fields in use by certain sheets in a workbook

We don't yet support creating files from scratch, adding extracts into workbooks or data sources, or updating field information.
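
The packaged formats in the feature list (`.twbx`, `.tdsx`) are zip archives that bundle the `.twb` or `.tds` XML with supporting files. The sketch below shows the general idea of reaching that XML using only the standard library. It is not the library's implementation: the helper name `open_packaged_xml` is hypothetical, and the assumption that the XML file sits at the top level of the archive is ours.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def open_packaged_xml(path_or_file):
    """Parse the .twb/.tds found inside a .twbx/.tdsx archive.

    Hypothetical helper for illustration; not part of tableaudocumentapi.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        # Assumption: the workbook/data source XML is a top-level member.
        candidates = [name for name in archive.namelist()
                      if "/" not in name and name.lower().endswith((".twb", ".tds"))]
        if not candidates:
            raise ValueError("no .twb or .tds file found in archive")
        with archive.open(candidates[0]) as member:
            return ET.parse(member)


# Build a tiny in-memory .tdsx so the sketch runs without a real Tableau file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("World.tds", "<datasource version='10.0' />")
    archive.writestr("Data/extract.tde", b"placeholder")
buffer.seek(0)

root = open_packaged_xml(buffer).getroot()
print(root.tag, root.get("version"))
```

In normal use you would not need any of this: passing a packaged file name to `Datasource.from_file` or `Workbook` is the supported route.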


### Getting Started
@@ -34,8 +45,19 @@ Download the `.zip` file that contains the SDK. Unzip the file and then run the
pip install -e <directory containing setup.py>
```

We plan on putting the package in PyPi to make installation easier.
#### Installing the Development Version From Git

*Only do this if you know you want the development version. There is no guarantee that we won't break APIs during development.*

```text
pip install git+https://github.com/tableau/document-api-python.git@development
```

If you go this route, but want to switch back to the non-development version, you need to run the following command before installing the stable version:

```text
pip uninstall tableaudocumentapi
```

### Basics
The following example shows the basic syntax for using the Document API to update a workbook:
@@ -52,7 +74,7 @@ sourceWB.datasources[0].connections[0].username = "benl"
sourceWB.save()
```

With Data Integration in Tableau 10, a datasource can have multiple connections. To access the connections simply index them like you would datasources
With Data Integration in Tableau 10, a data source can have multiple connections. To access the connections, simply index them as you would data sources.

```python
from tableaudocumentapi import Workbook
@@ -75,13 +97,13 @@ sourceWB.save()
**Notes**

- Import the `Workbook` object from the `tableaudocumentapi` module.
- To open a workbook, instantiate a `Workbook` object and pass the `.twb` file name in the constructor.
- The `Workbook` object exposes a `datasources` collection.
- Each datasource object has a `connection` object that supports a `server`, `dbname`, and `username` property.
- To open a workbook, instantiate a `Workbook` object and pass the file name as the first argument.
- The `Workbook` object exposes a list of `datasources` in the workbook.
- Each data source object exposes a list of `connections`; each connection object supports a `server`, `dbname`, and `username` property.
- Save changes to the workbook by calling the `save` or `save_as` method.
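
As a rough illustration of the notes above, the following sketch walks every data source and connection and collects the three properties. It relies only on the `datasources`, `connections`, `server`, `dbname`, and `username` attributes described here; the function name `list_connections` and the stand-in objects are hypothetical, used so the snippet runs without a Tableau file.

```python
from types import SimpleNamespace


def list_connections(workbook):
    """Collect (datasource index, server, dbname, username) for each connection."""
    rows = []
    for ds_index, datasource in enumerate(workbook.datasources):
        for connection in datasource.connections:
            rows.append((ds_index, connection.server, connection.dbname, connection.username))
    return rows


# Stand-in object so the sketch runs on its own; with the library installed
# you would pass Workbook('your_file.twb') instead.
fake_workbook = SimpleNamespace(datasources=[
    SimpleNamespace(connections=[
        SimpleNamespace(server="db1.example.com", dbname="sales", username="benl"),
        SimpleNamespace(server="db2.example.com", dbname="hr", username="benl"),
    ]),
])

for row in list_connections(fake_workbook):
    print(row)
```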



### Examples

The downloadable package contains an example named `replicateWorkbook.py` (in the folder `\Examples\Replicate Workbook`). This example reads an existing workbook and reads a .csv file that contains a list of servers, database names, and users. For each new user in the .csv file, the code copies the original workbook, updates the `server`, `dbname`, and `username` properties, and saves the workbook under a new name.
The downloadable package contains several example scripts that show more detailed usage of the Document API.
33 changes: 33 additions & 0 deletions contributing.md
@@ -0,0 +1,33 @@
# Contributing

We welcome contributions to this project!

Contributions can include, but are not limited to, any of the following:

* File an Issue
* Request a Feature
* Implement a Requested Feature
* Fix an Issue/Bug
* Add/Fix documentation

Contributions must follow the guidelines outlined on the [Tableau Organization](http://tableau.github.io/) page, though filing an issue or requesting
a feature does not require the CLA.

## Issues and Feature Requests

To submit an issue/bug report, or to request a feature, please submit a [GitHub issue](https://github.com/tableau/document-api-python/issues) to the repo.

If you are submitting a bug report, please provide as much information as you can, including clear and concise repro steps, attaching any necessary
files to assist in the repro. **Be sure to scrub the files of any potentially sensitive information. Issues are public.**

For a feature request, please try to describe the scenario you are trying to accomplish that requires the feature. This will help us understand
the limitations that you are running into, and provide us with a use case to know if we've satisfied your request.

## Fixes, Implementations, and Documentation

For all other things, please submit a PR that includes the fix, documentation, or new code that you are trying to contribute. More information on
creating a PR can be found in the [GitHub documentation](https://help.github.com/articles/creating-a-pull-request/).

If the feature is complex or has multiple solutions that could be equally appropriate approaches, it would be helpful to file an issue to discuss the
design trade-offs of each solution before implementing, to allow us to collectively arrive at the best solution, which most likely exists in the middle
somewhere.
9 changes: 9 additions & 0 deletions publish.sh
@@ -0,0 +1,9 @@
#!/usr/bin/env bash

set -e

rm -rf dist
python setup.py sdist
python setup.py bdist_wheel
python3 setup.py bdist_wheel
twine upload dist/*
10 changes: 10 additions & 0 deletions setup.cfg
@@ -0,0 +1,10 @@
[wheel]
universal = 1

[pycodestyle]
select =
max_line_length = 120

[pep8]
max_line_length = 120

5 changes: 3 additions & 2 deletions setup.py
@@ -5,11 +5,12 @@

setup(
    name='tableaudocumentapi',
    version='0.0.1',
    version='0.2',
    author='Tableau Software',
    author_email='[email protected]',
    url='https://github.com/tableau/document-api-python',
    packages=['tableaudocumentapi'],
    license='MIT',
    description='A Python module for working with Tableau files.'
    description='A Python module for working with Tableau files.',
    test_suite='test'
)
2 changes: 2 additions & 0 deletions tableaudocumentapi/__init__.py
@@ -1,5 +1,7 @@
from .field import Field
from .connection import Connection
from .datasource import Datasource, ConnectionParser
from .workbook import Workbook

__version__ = '0.0.1'
__VERSION__ = __version__
95 changes: 89 additions & 6 deletions tableaudocumentapi/datasource.py
@@ -3,12 +3,67 @@
# Datasource - A class for writing datasources to Tableau files
#
###############################################################################
import collections
import itertools
import xml.etree.ElementTree as ET
from tableaudocumentapi import Connection
import xml.sax.saxutils as sax

from tableaudocumentapi import Connection, xfile
from tableaudocumentapi import Field
from tableaudocumentapi.multilookup_dict import MultiLookupDict
from tableaudocumentapi.xfile import xml_open

class ConnectionParser(object):
########
# This is needed in order to determine if something is a string or not. It is necessary because
# of differences between python2 (basestring) and python3 (str). If python2 support is ever
# dropped, remove this and change the basestring references below to str
try:
    basestring
except NameError:
    basestring = str
########

_ColumnObjectReturnTuple = collections.namedtuple('_ColumnObjectReturnTupleType', ['id', 'object'])


def _get_metadata_xml_for_field(root_xml, field_name):
    if "'" in field_name:
        field_name = sax.escape(field_name, {"'": "&apos;"})
    xpath = ".//metadata-record[@class='column'][local-name='{}']".format(field_name)
    return root_xml.find(xpath)


def _is_used_by_worksheet(names, field):
    return any((y for y in names if y in field.worksheets))


class FieldDictionary(MultiLookupDict):
    def used_by_sheet(self, name):
        # If we pass in a string, no need to get complicated, just check to see if name is in
        # the field's list of worksheets
        if isinstance(name, basestring):
            return [x for x in self.values() if name in x.worksheets]

        # if we pass in a list, we need to check to see if any of the names in the list are in
        # the field's list of worksheets
        return [x for x in self.values() if _is_used_by_worksheet(name, x)]


def _column_object_from_column_xml(root_xml, column_xml):
    field_object = Field.from_column_xml(column_xml)
    local_name = field_object.id
    metadata_record = _get_metadata_xml_for_field(root_xml, local_name)
    if metadata_record is not None:
        field_object.apply_metadata(metadata_record)
    return _ColumnObjectReturnTuple(field_object.id, field_object)


def _column_object_from_metadata_xml(metadata_xml):
    field_object = Field.from_metadata_xml(metadata_xml)
    return _ColumnObjectReturnTuple(field_object.id, field_object)


class ConnectionParser(object):
    def __init__(self, datasource_xml, version):
        self._dsxml = datasource_xml
        self._dsversion = version
@@ -52,11 +107,13 @@ def __init__(self, dsxml, filename=None):
        self._connection_parser = ConnectionParser(
            self._datasourceXML, version=self._version)
        self._connections = self._connection_parser.get_connections()
        self._fields = None

    @classmethod
    def from_file(cls, filename):
        "Initialize datasource from file (.tds)"
        dsxml = ET.parse(filename).getroot()
        """Initialize datasource from file (.tds)"""

        dsxml = xml_open(filename).getroot()
        return cls(dsxml, filename)

    def save(self):
@@ -72,7 +129,8 @@ def save(self):
        """

        # save the file
        self._datasourceTree.write(self._filename, encoding="utf-8", xml_declaration=True)

        xfile._save_file(self._filename, self._datasourceTree)

    def save_as(self, new_filename):
        """
@@ -85,7 +143,7 @@ def save_as(self, new_filename):
            Nothing.
        """
        self._datasourceTree.write(new_filename, encoding="utf-8", xml_declaration=True)
        xfile._save_file(self._filename, self._datasourceTree, new_filename)

    ###########
    # name
@@ -107,3 +165,28 @@ def version(self):
    @property
    def connections(self):
        return self._connections

    ###########
    # fields
    ###########
    @property
    def fields(self):
        if not self._fields:
            self._fields = self._get_all_fields()
        return self._fields

    def _get_all_fields(self):
        column_field_objects = self._get_column_objects()
        existing_column_fields = [x.id for x in column_field_objects]
        metadata_only_field_objects = (x for x in self._get_metadata_objects() if x.id not in existing_column_fields)
        field_objects = itertools.chain(column_field_objects, metadata_only_field_objects)

        return FieldDictionary({k: v for k, v in field_objects})

    def _get_metadata_objects(self):
        return (_column_object_from_metadata_xml(x)
                for x in self._datasourceTree.findall(".//metadata-record[@class='column']"))

    def _get_column_objects(self):
        return [_column_object_from_column_xml(self._datasourceTree, xml)
                for xml in self._datasourceTree.findall('.//column')]
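
The `used_by_sheet` lookup added in `FieldDictionary` accepts either a single sheet name or a list of names. The sketch below reproduces that dispatch in isolation so the behavior is easy to check. It makes several simplifying assumptions that differ from the commit: a plain `dict` replaces `MultiLookupDict`, a namedtuple called `Field` stands in for the library's `Field` class, and the string check uses `str` only (Python 3), without the `basestring` shim.

```python
import collections

# Stand-in for the library's Field; only the attributes used here.
Field = collections.namedtuple("Field", ["id", "worksheets"])


class FieldDictionary(dict):
    def used_by_sheet(self, name):
        # A single sheet name: direct membership test against each field.
        if isinstance(name, str):
            return [f for f in self.values() if name in f.worksheets]
        # Otherwise treat the argument as an iterable of names and
        # keep fields used by at least one of them.
        return [f for f in self.values() if any(n in f.worksheets for n in name)]


fields = FieldDictionary({
    "[Country]": Field("[Country]", ["Map", "Summary"]),
    "[GDP]": Field("[GDP]", ["Summary"]),
    "[Region]": Field("[Region]", []),
})

print([f.id for f in fields.used_by_sheet("Map")])
print(sorted(f.id for f in fields.used_by_sheet(["Map", "Summary"])))
```

Checking for a string first matters because a string is itself iterable; without that branch, a single name would be matched character by character.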