
ApiNATOMY model RDF export and deployment


ApiNATOMY to RDF/OWL2 conversion

Basic strategy

JSON -> RDF -> OWL2
Conversion from JSON to RDF should involve few if any changes in semantics. Conversion from the direct RDF translation to OWL2 is where the semantic translation of ApiNATOMY structures into OWL2 constructs happens.
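
To make the two-step split concrete, here is a minimal sketch (assuming rdflib, with a hypothetical record and URI patterns modeled on the /uris/ scheme described later in this document): the first step is a mechanical key/value to triple translation, and only a later pass would map those triples onto OWL2 constructs.

import rdflib

# hypothetical fragment of an ApiNATOMY JSON model
record = {"id": "lyph-1", "name": "example lyph", "topology": "TUBE"}

model = rdflib.Namespace("https://apinatomy.org/uris/models/example/ids/")
readable = rdflib.Namespace("https://apinatomy.org/uris/readable/")

g = rdflib.Graph()
s = model[record["id"]]
# direct translation: one triple per key/value pair, no OWL2 semantics yet
for key, value in record.items():
    if key == "id":
        continue
    g.add((s, readable[key], rdflib.Literal(value)))

print(g.serialize(format="turtle"))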

Server setup

On the ontology host (read: the ttl host, not the SciGraph host) you will need the following.

mkdir /var/www/sparc/ApiNATOMY
mkdir /var/www/sparc/ApiNATOMY/archive
mkdir /var/www/sparc/ApiNATOMY/archive/manual
mkdir /var/www/sparc/ApiNATOMY/ontologies
chown -R nginx:nginx /var/www/sparc/ApiNATOMY

SciGraph pipeline

Everything

function apinat-full-to-prod () {
    echo this is not ready
    return 1

    apinat-build-all &&
    pushd ~/git/apinatomy-models
    git commit &&
    popd
    echo This would deploy: $(git diff --name-only HEAD~1..HEAD | cut -d'/' -f 1 | sort -u)

    apinat-deploy-ttls $(git diff --name-only HEAD~1..HEAD | cut -d'/' -f 1 | sort -u)
    ~/git/pyontutils/nifstd/scigraph/bin/run-load-graph-sparc-data &&
    ~/git/pyontutils/nifstd/scigraph/bin/run-deploy-graph-sparc-data
}

function apinat-full-to-dev () {
    apinat-build-all &&
    ~/git/pyontutils/nifstd/scigraph/bin/run-load-graph-sparc-data-dev &&
    echo TODO figure out the dance we have to do for docker may need a separate function
}

Parallel build

For the time being (until a bit more machinery is in place for orthauth style configuration in elisp) you can put the path to secrets.sxpr in ~/.emacs.d/orgstrap-init.el with the following expression (setq oa-secrets "/path/to/secrets.sxpr").

function apinat-build-all () {
    pushd ~/git/apinatomy-models
    pushd models
    local ids=$(ls -d *)
    popd
    echo $ids | \
    xargs -P10 -r -I {} \
        bash -c 'apinat-build --repo $(pwd) --model-id ${@}'" ${@}" _ {}

    # only add models where the json has changed since the xlsx is opaque
    git add -u $(git status --porcelain | grep json | grep M | cut -d' ' -f3 | cut -d'/' -f1-2)
    popd
}
(let ((default-directory apinat-model-repo))
  (cl-remove-if-not
   (lambda (s) (string-match ".json$" s))
   (split-string
    (run-command "git" "diff" "--name-only" "HEAD" "HEAD~1" "--" "models"))))

Export to ttl

r-apin-mod:HEAD:{model-id}/

YEAH it DOES support arbitrary locations, with the %s implicitly at the tail by default

gsx:google-sheet-id

(defvar-local apinat-model-repo "~/git/apinatomy-models/"
  "Path to a local copy of the apinatomy-models repository.")

(defvar-local apinat-converter-command "apinat-converter"
  "Command name or full path to javascript apinatomy converter.")

(defvar-local apinat-converter-debug nil
  "Dynamic variable used to control debug behavior.")

(defun-local ex-do (&rest nothing)
  "Executor do. Eats the input since it is to be run by the executor."
  ;; maybe use this to resolve the ex-come-from flows?
  ;; while loop would be annoying here
  (yes-or-no-p "Step done? "))

(defun-local ex-label (label) ;; call/cc, prompt, or cl condition handling is likely the abstraction we would want
  "Target acquired!")

(defun-local ex-come-from (label)
  ;; detangling what is going on when using this: there is an implicit assertion
  ;; that a linear set of `ex-do' steps results in the same final state as a series
  ;; of sequential calls, the reason to use come-from is that the conductor program
  ;; can return and go on to the next step as expected (though a post condition
  ;; measurement ala a contract needs to be confirmed), the conductor continues
  ;; along to the next step, and the extracted and compiled `ex-do' (which is really
  ;; a macro) command uses the come-from to stitch the dependency tree back together
  ;; for the real world steps that are most efficient, the problem with this is that
  ;; you have to prove that the ordering for both graphs is compatible, namely that
  ;; for a particular set of sequential calls there is only a single come-from
  ;; per line, otherwise you have to figure out what it means to have 20 things that
  ;; can all in principle happen after the completion of a step, which is possible
  ;; but would need a solution
  ;;
  ;; scheduling, queue theory, with the additional notion of affinity, because human
  ;; executors are not interchangeable in the same way as a cpu
  ;; I think you model it as generic + specialized, and you schedule specialized
  ;; first because the pool for those is limited
  "AAAAAAAAA!")

(defmacro apinat--with-model (model-id &rest body)
  (declare (indent defun))
  `(let ((default-directory (expand-file-name
                             (symbol-name ,model-id)
                             (expand-file-name "models" apinat-model-repo)))
         (xlsx    (format "source/%s.xlsx"    model-id))
         (json    (format "source/%s.json"    model-id))
         (jsong   (format "derived/%s.json"   model-id))
         (jsonld  (format "derived/%s.jsonld" model-id))
         (ttl     (format "derived/%s.ttl"    model-id)))
     ,@body))
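
;; For example, (apinat--with-model 'vagus-nerve ...) binds `default-directory'
;; to models/vagus-nerve/ under `apinat-model-repo' and binds the conventional
;; relative paths source/vagus-nerve.xlsx, source/vagus-nerve.json,
;; derived/vagus-nerve.json, derived/vagus-nerve.jsonld, and
;; derived/vagus-nerve.ttl for use in the body.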

(defun-local flow-to-ttl (model-id)
  ;; probably need 1 more level of indirection to handle cases where the model-id
  ;; will be put inside the models/ folder which should probably happen sooner rather than later
  ;;(let ((default-directory (expand-file-name model-id (git-share-local r-apin-mod:HEAD:))) ; TODO ))
  '
  (apinat--with-model model-id
    ;; TODO make sure the directories exist
    ;; TODO resume from previous step on failure <- this is critical
    ;; check out `file-newer-than-file-p' as a reasonable approach ala make
    ;; the only issue is how to prevent the xlsx retrieval from notching it up
    ;; every single time, maybe we can compare checksums on the xlsx file?
    ;; TODO push the model id further down the call chain since input
    ;; and output paths are defined by convention
    (funcall (checksum-or #'-mx->)  model-id   xlsx) ; source/{model-id}.xlsx
    ;;; FIXME TODO message about opening the open physiology viewer
    (-xjl-> xlsx json jsonld)
    ;;(funcall (out-or      #'-xj->)  xlsx       json) ; source/{model-id}.xlsx source/{model-id}.json
    ;;(funcall (out-or      #'-jl->)  json     jsonld) ; source/{model-id}.json derived/{model-id}.jsonld
    (funcall (out-or      #'-lt->)  jsonld      ttl) ; derived/{model-id}.jsonld derived/{model-id}.ttl
    )

  (apinat--with-model model-id
    (-mxjjl-> model-id xlsx json jsong jsonld)
    (-lt-> jsonld ttl)))

(defun-local checksum-or (fun)
  (lambda (thing path-out)
    ;;(message "dd: %s" default-directory)
    (if (file-exists-p path-out)
        (let ((path-temp (let ((temporary-file-directory default-directory))
                           ;; keep the temp nearby rather than wherever the tfd is
                           (make-temp-file (concat path-out "._maybe_new_")))))
          (unwind-protect
              (progn
                (funcall fun thing path-temp)
                (let ((checksum-new (securl-path-checksum path-temp))
                      (checksum-old (securl-path-checksum path-out)))
                  (unless (string= checksum-new checksum-old)
                    (rename-file path-temp path-out t))))
            (when (file-exists-p path-temp)
              (delete-file path-temp))))
      (funcall fun thing path-out))))
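
;; `checksum-or' wraps a conversion function so that when the output file already
;; exists the new result is written to a temp file and only replaces the existing
;; file if the checksum differs; this keeps mtimes stable so that staleness checks
;; like `file-newer-than-file-p' are not retriggered when content has not changed.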

(defun-local out-or (fun)
  (lambda (path-in path-out)
    (when (and (file-exists-p path-in)
               (or (not (file-exists-p path-out))
                   (and (file-exists-p path-out)
                        (file-newer-than-file-p path-in path-out))))
      (funcall fun path-in path-out))))

;; model id -> some path

(defun-local -mt-> (model-id path-ttl) "`model-id' to `path-ttl'")
(defun-local -ml-> (model-id path-jsonld) "`model-id' to `path-jsonld'")
(defun-local -mxjjl-> (model-id path-xlsx path-json path-jsong path-jsonld)
  "`model-id' to everything except the ttl"
  (let ((path-xlsx (concat default-directory "/" path-xlsx))
        (path-json (concat default-directory "/" path-json))
        (path-jsong (concat default-directory "/" path-jsong))
        (path-jsonld (concat default-directory "/" path-jsonld))
        (path-temp (make-temp-file "apinat-conversion" 'directory)))
    (unwind-protect
        (let* ((default-directory path-temp)
               (google-sheet-id (symbol-name (-ms-> model-id)))
               (path-internal (concat path-temp "/build")) ; should not exist to avoid date suffix
               (_ (let (backtrace-on-error-noninteractive ; we know where the error happens
                        (rc-argv (list
                                  apinat-converter-command
                                  "-f" "id"
                                  "-t" "xlsx"
                                  "-t" "json"
                                  "-t" "json-resources"
                                  "-t" "json-flattened"
                                  (if apinat-converter-debug "-t" "")
                                  (if apinat-converter-debug "json-ld" "")
                                  "-i" google-sheet-id
                                  "-o" path-internal)))
                    (when apinat-converter-debug
                      (message "command: %s" (string-join rc-argv " ")))
                    (apply #'run-command rc-argv))))
          (cl-loop for path in (list path-xlsx path-json path-jsonld) do
                   (let ((parent (file-name-directory path)))
                     (unless (file-directory-p parent)
                       (make-directory parent t))))
          (rename-file (concat path-internal "/" "model.xlsx") path-xlsx t)
          (rename-file (concat path-internal "/" "model.json") path-json t)
          (rename-file (concat path-internal "/" "model-generated.json") path-jsong t)
          (rename-file (concat path-internal "/" "model-flattened.jsonld") path-jsonld t))
      (unless apinat-converter-debug
        (delete-directory path-temp 'recursive)))))

(defun-local -mj-> (model-id path-json) "`model-id' to `path-json'")

;; intermediate steps for model id

(defun-local -m-lt-> (model-id)
  (apinat--with-model model-id
    (funcall (out-or #'-lt->) jsonld ttl)))

(defun-local -m-x-> (model-id)
  (apinat--with-model model-id
    (-mx-> model-id xlsx)))

(defun-local -ms-> (model-id)
  (oa-path :google :sheets (if (keywordp model-id)
                               model-id
                             (intern (format ":%s" model-id)))))

(defun-local -mx-> (model-id path-xlsx)
  ;; automated
  (let* ((google-sheet-id (-ms-> model-id))
         (url (format "https://docs.google.com/spreadsheets/d/%s/export?format=xlsx" google-sheet-id)))
    ' ; it probably makes more sense to implement stuff like this using the condition system?
    ;; in terms of flow control for a DAG you try to do the thing,
    ;; stop at your first error and then go do the dependency? but in
    ;; reality there is often an explicit step where all checks must
    ;; pass before the whole process can continue because of some time
    ;; constraint or similar
    (ex-do (message "Make sure that the permissions are set correctly on %s" url))
    ;; NOTE `url-copy-file' cannot detect login redirects correctly
    ;; google sends a 307 for the download if everything is going to work
    ;; in curl it sends a 302 but never something in the 400 range
    ;; ideally we would be able to (run-command "mimetype" path-xlsx)
    ;; but that requires that users have the mimetype command available
    (url-copy-file url path-xlsx t)))
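
;; For example, for model-id `vagus-nerve' this looks up the google sheet id via
;; `-ms->' and downloads
;; https://docs.google.com/spreadsheets/d/{sheet-id}/export?format=xlsx
;; to source/vagus-nerve.xlsx.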

(defun-local -xj-> (path-xlsx path-json)
  "This is currently a manual step."
  (let (;(open-physiology-viewer "file:///home/tom/git/open-physiology-viewer/dist/test-app/index.html")
        (open-physiology-viewer "https://open-physiology-viewer.surge.sh/"))
    ;; TODO conditional open only if not already
    ;;(browse-url open-physiology-viewer)
    ;;(run-command "google-chrome-unstable" open-physiology-viewer)
    (ex-do (message "open file (left top folder) to upload to viewer from %s" path-xlsx)
           (message "save file (left bottom floppy) to download from viewer to %s" path-json)
           (ex-label 'viewer-after-open))))

(defun-local -jl-> (path-json path-jsonld)
  "Currently a manual step."
  (ex-do (ex-come-from 'viewer-after-open) ; This is amazing.
         ;; Allows decoupling of functional spec from the actual execution in the real world.
         ;; As a bonus we get to use my all time favorite control flow structure.
         (message "export flattened json-ld (right 2nd from bot white doc) to download from viewer to %s"
                  path-jsonld)))

(defun-local -xjl-> (path-xlsx path-json path-jsonld)
  ;; yay automated NOTE requires nodejs and open-physiology-viewer
  (let ((path-xlsx (concat default-directory "/" path-xlsx))
        (path-json (concat default-directory "/" path-json))
        (path-jsonld (concat default-directory "/" path-jsonld))
        (path-temp (make-temp-file "apinat-conversion" 'directory)))
    (unwind-protect
        (let* ((default-directory path-temp)
               (_ (run-command apinat-converter-command "-m" "xlsx" "-i" path-xlsx))
               (output-dir (car (directory-files default-directory nil "converted-*"))))
          ;; '("model-flattened.jsonLD" "model-generated.json" "model.json" "model.jsonLD")
          (rename-file (concat output-dir "/" "model.json") path-json t)
          (rename-file (concat output-dir "/" "model-flattened.jsonLD") path-jsonld t))
      (delete-directory path-temp 'recursive))))

(defun-local -lt-> (path-jsonld path-ttl)
  ;; automated
  (let (backtrace-on-error-noninteractive)
    (run-command (or (executable-find "pypy3") ; beware missing libs
                     (executable-find "python"))
                 "-m" "sparcur.cli" "apinat" path-jsonld path-ttl)))
(defun-local update-models (model-ids) ; vs &rest model-ids
  ;; FIXME mapcar is inadequate for handling parallel processes that
  ;; might have `ex-do' parts
  (mapcar #'flow-to-ttl model-ids))

(defun-local all-models ()
  ;;(let ((default-directory (git-share-local r-apin-mod:HEAD:)) ; TODO ))
  (let ((default-directory (expand-file-name "models" apinat-model-repo)))
    ;; you could use something like model-repository but then you have to make
    ;; a bunch of concatenations, better just to switch the default directory
    ;; so that the context deals with alignment between name and local referent
    (cl-remove-if (lambda (p)
                    (or (not (file-directory-p p))
                        (string-prefix-p "." p)
                        (not (file-exists-p (concat p "/source/" p ".json")))))
                  (directory-files default-directory))))

(defun apinat--ttl-newer (model-id)
  (apinat--with-model model-id
    (let ((mtimes
           (mapcar (lambda (p)
                     (string-to-number
                      (format-time-string
                       "%s"
                       (file-attribute-modification-time (file-attributes p)))))
                   (list ttl xlsx))))
      (message "%S" mtimes)
      (and (file-exists-p ttl)
           (apply #'> mtimes)))))

;; TODO apinat--remote-older or something

(defun apinat--all-except (except)
  (cl-remove-if (lambda (id) (memq id except)) (mapcar #'intern (all-models))))

(defun-local filter-recent-models (model-ids)
  (cl-remove-if #'apinat--ttl-newer model-ids))

(defun-local update-all-models (&optional skip-recent)
  (update-models (if skip-recent
                     (filter-recent-models (mapcar #'intern (all-models)))
                   (mapcar #'intern (all-models)))))

(defvar-local apinat-process-results nil)

(defun sentinel (process message &optional stderr-process)
  (when (memq (process-status process) '(exit signal))
    (let ((ex (process-exit-status process))
          (buf (process-buffer process))
          (cmd (process-command process)))
      (setq apinat-process-results
            (cons (list ex (with-current-buffer buf (buffer-string)))
                  apinat-process-results))
      (if (= ex 0)
        (message "completed: %S" process)
        (warn "command failed with %s: %s" ex (string-join cmd " "))
        (warn "stdout: %S stderr: %S"
              (with-current-buffer buf (buffer-string))
              (and stderr-process
                   (with-current-buffer (process-buffer stderr-process)
                     (buffer-string))))))))

(defun-local update-all-models-async ()
  (let ((model-ids (all-models)))
    (message "updating all models %s" model-ids)
    (cl-loop
     for model-id in model-ids
     collect
     ;; TODO consider whether we can somehow use invocation-name invocation-directory so that
     ;; specific versions of emacs are used to run the block instead of always the system version
     (ow-run-command-async "sh" :sentinel #'sentinel (buffer-file-name) "--model-id" model-id "--secrets" oa-secrets "--repo" apinat-model-repo
                           (if apinat-converter-debug "--debug" "")))
    (while (< (length apinat-process-results) (length model-ids))
      (sleep-for 5)
      (message "complete: %s/%s" (length apinat-process-results) (length model-ids)))))
;; FIXME do fetch all in one batch so we don't have
;; to wait for the ttl export between each model
(update-all-models t)
' ; or pick your own models
(update-models '(vagus-nerve))
' ; jsonld -> ttl conversion
(-m-lt-> 'vagus-nerve)

if there is an error, clone the repo

pushd ~/git
git clone https://github.com/open-physiology/apinatomy-models.git

if there is a model-id error then the model ids need to be set in secrets, but in practice overwriting the definition of -ms-> is easier right now

missing derived folders (use either the PowerShell or the shell variant below)

pushd ~/git/apinatomy-models/
New-Item -Path * -Name derived -ItemType "directory"
pushd ~/git/apinatomy-models/
find -maxdepth 1 -type d -not -path '*.git*' -not -path '.' -exec mkdir {}/derived \;

Deploy ttl

After running the ttl export, define the functions in ref:deploy-ontology-file and then run apinat-deploy-from-ttl bronchomotor.ttl. NOTE: both functions need to be defined.

(defvar apinat--remote-onts-path ; TODO source from config
  "/ssh:cassava|sudo:nginx@cassava:/var/www/sparc/ApiNATOMY/ontologies/")

(defun apinat--write-to-remote (model-id)
  "Deploy a single apinatomy model MODEL-ID to `apinat--remote-onts-path'."
  (apinat--with-model model-id
    (with-current-buffer (find-file-noselect ttl 'no-warn 'rawfile)
      ;; FIXME this has inverted nesting if we want to start depositing
      ;; multiple different models
      (unwind-protect
          (let* (;(version "TODO datetime etc, or better, read it from the ttl file")
                 (version (int-to-string (time-convert nil 'integer)))
                 (version-path-local
                  (concat (symbol-name model-id) "/" version "/" (file-name-nondirectory ttl)))
                 (version-path
                  (expand-file-name
                   version-path-local
                   apinat--remote-onts-path))
                 (latest-path
                  (expand-file-name
                   (file-name-nondirectory ttl)
                   apinat--remote-onts-path))
                 (tramp-histfile-override "/dev/null"))
            ;; if the version already exists, error
            ;; otherwise make it via tramp
            (if (file-directory-p version-path)
                ;; XXX this call to `file-directory-p' outside where
                ;; `tramp-histfile-override' is bound to t is critical
                ;; to prevent the tramp ssh sudo chain from truncating
                ;; your .bash_history file, see
                (error "Version exists!")
              (let ((tramp-histfile-override t))
                ;; system users have no home, tramp will error without override t
                ;; XXX WARNING if `tramp-histfile-override' is set to t and somehow
                ;; the scope of a call affects a local file path or runs on the
                ;; local system it WILL truncate the history file for the current user ;_;
                (make-directory (file-name-directory version-path) 'parents)
                (write-file version-path)
                ;;(message "%S %S" version-path-local latest-path)
                (make-symbolic-link
                 version-path-local
                 latest-path
                 'ok-if-already-exists))))
        (kill-buffer (current-buffer))))))
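
;; The resulting layout on the remote is
;;   ontologies/{model-id}/{unix-timestamp}/{model-id}.ttl
;; with a top level symlink ontologies/{model-id}.ttl pointing at the most
;; recently deployed version path.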

(defun apinat-deploy-models (model-ids)
  ;; FIXME check already deployed
  (let (fails)
    (cl-loop for model-id in model-ids
             do (condition-case nil
                    (apinat--write-to-remote model-id)
                  (error (push model-id fails))))
    fails))

The current command to deploy all models is:

for f in $(ls models/*/derived/*.ttl); do echo apinat-deploy-from-ttl $f; done

Alternatively, use the following to deploy specific models.

# run in the deployed ontologies directory (on the ontology host); prints the
# ISO date of the most recent deploy, which is used as LAST_DEPLOY below
date -Is -d "@$(find | awk -F'/' '{ print $3 }' | sort | tail -n 1)"
# in the local model repo: list models whose json changed since the last deploy
pushd "${REPO}" 2>&1 > /dev/null
git log --name-only --since "${LAST_DEPLOY}" --pretty="format:" | grep json | cut -d'/' -f2 | sort -u

Always run apinat-deploy-ttls from inside apinat-model-repo

apinat-deploy-ttls $(git diff --name-only HEAD~1..HEAD | cut -d'/' -f 2 | sort -u)

If you add a new model you will need to update the imports in https://cassava.ucsd.edu/ApiNATOMY/ontologies/sparc-data.ttl. The update process should be automated as part of the workflows described here. See also https://github.com/SciCrunch/sparc-curation/blob/master/resources/scigraph/ontologies-sparc-data.yaml.
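
Until that automation exists, the import update amounts to adding one owl:imports statement to sparc-data.ttl. A rough rdflib sketch of that edit (the new model IRI shown is illustrative):

import rdflib
from rdflib.namespace import OWL, RDF

SPARC_DATA = "https://cassava.ucsd.edu/ApiNATOMY/ontologies/sparc-data.ttl"
# hypothetical ttl published for a new model by apinat-deploy-from-ttl
NEW_MODEL = rdflib.URIRef(
    "https://cassava.ucsd.edu/ApiNATOMY/ontologies/new-model.ttl")

g = rdflib.Graph().parse(SPARC_DATA, format="turtle")
# add the new model to the owl:imports of the ontology header
for ont in g.subjects(RDF.type, OWL.Ontology):
    g.add((ont, OWL.imports, NEW_MODEL))

g.serialize("sparc-data.ttl", format="turtle")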

function apinat-remote-operations () {
    local PATH_SOURCE="${1}"
    local PATH_TARGET="${2}"
    local PATH_LINK="${3}"
    local DIR_LINK="$(dirname "${PATH_LINK}")"
    local LINK_TARGET="$(realpath -m --relative-to="${DIR_LINK}" "${PATH_TARGET}")"
    mkdir -p "$(dirname "${PATH_TARGET}")"
    chown nginx:nginx "${PATH_SOURCE}"
    mv "${PATH_SOURCE}" "${PATH_TARGET}"
    # FIXME we need to fail if the source path does not exist otherwise we end in broken state
    unlink "${PATH_LINK}"
    ln -s "${LINK_TARGET}" "${PATH_LINK}"
}

function apinat-deploy-from-ttl () {
    # TODO loop over positional argument paths, but retain a single ssh command
    local PATH_TTL="${1}"  # FIXME careful with this, never allow a user to set the source path
    local DATE=$(date +%s)  # FIXME source from the ontology directly? better to spend time implementing OntResIriWrite
    local HOST_APINAT_ONTOLOGY=cassava
    local FILE_NAME_TTL=$(basename -- "${PATH_TTL}")
    local NAME_TTL="${FILE_NAME_TTL%.*}"
    local PATH_REMOTE_TARGET_BASE=/var/www/sparc/ApiNATOMY/ontologies/
    local VERSION_PATH="${NAME_TTL}/${DATE}/${FILE_NAME_TTL}"
    local PATH_REMOTE_SOURCE="/tmp/${FILE_NAME_TTL}"
    local PATH_REMOTE_TARGET="${PATH_REMOTE_TARGET_BASE}${VERSION_PATH}"
    local PATH_REMOTE_LINK="${PATH_REMOTE_TARGET_BASE}${FILE_NAME_TTL}"

    # FIXME also notify host for sudo
    local SUDO_OR_SU='$(command -v sudo 1>& 2 && echo sudo ${0} -c || { echo For su on ${HOSTNAME} 1>& 2; echo su -c; })'

    # TODO ensure that apinat-remote-operations is defined
    rsync --rsh ssh "${PATH_TTL}" ${HOST_APINAT_ONTOLOGY}:"${PATH_REMOTE_SOURCE}"
    ssh -t ${HOST_APINAT_ONTOLOGY} "${SUDO_OR_SU} '$(typeset -f apinat-remote-operations); apinat-remote-operations \
\"${PATH_REMOTE_SOURCE}\" \
\"${PATH_REMOTE_TARGET}\" \
\"${PATH_REMOTE_LINK}\"'"
}

function apinat-deploy-ttls () {
    # TODO do it in batch, derive the timestamps correctly etc.
    for id in $@; do
        apinat-deploy-from-ttl "models/${id}/derived/${id}.ttl"
    done
}

Check https://cassava.ucsd.edu/ApiNATOMY/ontologies/ for success if needed.

spc report changes \
--ttl-file https://cassava.ucsd.edu/ApiNATOMY/ontologies/keast-bladder/1620348301/keast-bladder.ttl \
--ttl-compare https://cassava.ucsd.edu/ApiNATOMY/ontologies/keast-bladder/1617055182/keast-bladder.ttl

Load and deploy graph

Then run run-load-deploy-graph-sparc-data to load and deploy in one shot.

An example run is

~/git/pyontutils/nifstd/scigraph/bin/run-load-graph-sparc-data
~/git/pyontutils/nifstd/scigraph/bin/run-deploy-graph-sparc-data

NPO identifiers

minimal listing

Run this and commit the output to the neurons branch of the ontology.

import rdflib
import augpathlib as aug
from pyontutils.core import OntGraph, OntResPath
from pyontutils.config import auth
from pyontutils.namespaces import *
repo_relative_path = 'ttl/generated/neurons/apinatomy-neuron-populations.ttl'
uri_base = 'https://raw.githubusercontent.com/SciCrunch/NIF-Ontology/neurons/'
oid = rdflib.URIRef(uri_base + repo_relative_path)

def _genlabel(oid):
    x = (oid.rsplit("-", 3)[-3]
         if oid.endswith('-prime')
         else (oid.rsplit("-", 3)[-3] if 'unbranched' in oid else oid.rsplit("-", 2)[-2]))
    l = (oid.rsplit("-", 3)[-3]
         if oid.endswith('-prime')
         else (' '.join(oid.rsplit("-", 3)[-3:-1]) if 'unbranched' in oid else oid.rsplit("-", 2)[-2]))
    t = (oid.rsplit("-", 2)[-2] + "'"
         if oid.endswith('-prime')
         else (oid.rsplit("-", 1)[-1] if 'unbranched' in oid else oid.rsplit("-", 1)[-1]))

    # XXX manual fix for consistency
    if x == 'keast':
        x = 'kblad'
    if l == 'keast':
        l = 'kblad'

    label = f'neuron type {l} {t}'
    return label, x

def genlabel(oid):
    return _genlabel(oid)[0]
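
# illustrative example (hypothetical identifier, not taken from the data):
#   _genlabel('.../readable/neuron-type-keast-5') -> ('neuron type kblad 5', 'kblad')
# the trailing index becomes the type number and the 'keast' infix is
# normalized to 'kblad' for consistency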

g = OntGraph()
g.populate_from_triples(
    ((oid, p, o) for p, o in
     ((rdf.type, owl.Ontology),
)))

skip = 'small-intestine',
amr = aug.LocalPath("~/git/apinatomy-models").expanduser()
models = [c for c in (amr / 'models').children if c.name != 'too-map']
mgraphs = []
for m in models:
    if m.name in skip:
        continue
    for p in (m / 'derived' / (m.name + '.ttl'),):
        if p.exists():
            try:
                mgraphs.append(OntResPath(p).graph)
            except Exception as e:
                raise Exception(p) from e

for mg in mgraphs:
    mg.namespace_manager.populate(g)
    # `elements' is expected to be the ApiNATOMY elements namespace
    for s in mg[:rdf.type:elements.OntologyTerm]:
        if 'readable/neuron-type' in s:  # FIXME haaaack
            g.add((s, rdfs.subClassOf, ilxtr.NeuronEBM))
            g.add((s, rdfs.label, rdflib.Literal(genlabel(s))))

olr = aug.LocalPath(auth.get('ontology-local-repo')).expanduser()
g.write(olr / repo_relative_path)

Dynamic cypher queries

NOTE: this section contains temporary instructions. This should really be done on a development instance of data services. Sometimes it is faster to edit /ssh:aws-scigraph-data-scigraph:services.yaml directly. Use the following command to restart services to load the updated dynamic queries.

ssh aws-scigraph-data sudo systemctl restart scigraph

When you have a query working as desired, add it or update it in cypher resources.

See also data services build and deploy.

Add new ApiNATOMY model to SciGraph load

Edit sparc-data.ttl and add a new line to the second owl:import statement.

ApiNATOMY model server specification

Intro

While an ApiNATOMY server has been on the roadmap for some time, there have not been clear requirements and use cases to drive the development in a way that is productive. As the conversion of ApiNATOMY models to RDF has progressed, some of the requirements and use cases have presented themselves and helped to solidify a set of initial use cases. The need to integrate knowledge represented in ApiNATOMY into the larger linked data space provides the initial requirements: the server must be able to provide persistent and resolvable identifiers for ApiNATOMY models, and it must be able to provide high granularity access to the version history of these models. In addition, we are ultimately aiming for the server to be able to automatically convert input models or spreadsheets into generated models and resource maps. We have mapped out three phases for arriving at this end goal. The first phase is to be able to resolve input models; the second is to be able to upload and link the generated model and resource map and track which input model they came from. These two will address our primary short-term needs.

To accomplish this, the plan is to use git (via GitHub) as the primary datastore for the models. This will allow us to leverage the significant existing infrastructure around GitHub for version control, collaboration, review, content hosting, and backup. In front of this there will be a server that provides resolvable persistent identifiers for ApiNATOMY models so that the identifiers appearing in the linked data graphs will be resolvable and interoperable with the rest of the NIF-Ontology search and discovery tooling.

In the future as part of the third phase we can work towards automating the conversion of input models, and it might also be possible to have the server automatically convert and serve the RDF version of the models as well.

A brief outline of the initial requirements needed to meet the needs of the RDF conversion pipeline is documented below.

Architecture diagram

Figure: apinatomy-server-diagram.png

Legend:
Solid lines: initial
Dashed lines: soon
Dotted lines: later
Dashed dotted lines: dataflow

https by default

url structure

apinatomy.org

  • /

    landing page to maximize engagement

    existing index page links to the various git repositories, the youtube introduction, the wikipedia page and similar, funding, papers, and use cases

  • /viewer

    the open-physiology-viewer; a search entry point could also appear here

  • /dashboard

    list of anatomical entities that have been used to annotate, the number of models, etc. from the queries.org file; somehow link this into search?

  • /docs/{page}
    • /docs/source.html

      move under /docs

    • /docs/identifiers.html

      generated via esdoc, issues with angular components

  • /docs/manual

    currently at /manual, out of date, coming from the open-physiology-viewer/manual markdown files

  • /uris
    • /uris/models/{model-id}

      need a landing page that has all the model metadata and would allow users to open the model in the viewer with a single click, maybe even have a static image of the model rendered on the page

    • /uris/models/{model-id}.{ext}

      how to deal with json/ttl and with the model, generated, and resource-map variants (see the sketch after this list)

    • /uris/models/{model-id}/ids/{local-id}
    • /uris/models/{model-id}/snapshot/{snapshot-id}

      Implies that snapshots are always associated with a single model, so that if there are multiple models they should be imported into a single top level file.

      need some way to resolve snapshot files

    • /uris/readable/{string}
    • /uris/elements/{string}
  • TODO json-ld context
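
As a concrete reading of the layout above, a minimal sketch (names illustrative) of how model URIs would be composed under this scheme:

BASE = "https://apinatomy.org"

def model_uri(model_id, ext=None, local_id=None, snapshot_id=None):
    """Compose /uris/models/... URIs following the structure listed above."""
    uri = f"{BASE}/uris/models/{model_id}"
    if local_id is not None:     # /uris/models/{model-id}/ids/{local-id}
        return f"{uri}/ids/{local_id}"
    if snapshot_id is not None:  # /uris/models/{model-id}/snapshot/{snapshot-id}
        return f"{uri}/snapshot/{snapshot_id}"
    if ext is not None:          # /uris/models/{model-id}.{ext}
        return f"{uri}.{ext}"
    return uri

print(model_uri("vagus-nerve", ext="ttl"))
print(model_uri("vagus-nerve", local_id="some-local-id"))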

transformed models/copies need to be able to point back to the exact commit

for deposition on blackfynn, export to scigraph, etc. the source model hash needs to be separate

Serve the JSONLD context

return authoring metadata

store the source model

have endpoint for resource-map and generated

overlap with loading in the client

load all formats from local

google sheets import

load from a url

Reporting

import json
import augpathlib as aug
from pyontutils.core import OntGraph
from pyontutils.namespaces import rdf, owl


def path_json(string):
    with open(string, 'rt') as f:
        return json.load(f)


def main():
    graph = OntGraph()
    apinat_models = aug.RepoPath('~/git/apinatomy-models').expanduser()
    [graph.parse(f) for f in apinat_models.rglob('*.ttl')]
    # rdf
    n_trip = len(graph)
    n_class = len(set(graph[:rdf.type:owl.Class]))
    n_ind = len(set(graph[:rdf.type:owl.NamedIndividual]))
    # json
    js = [path_json(p) for p in apinat_models.rglob('*.json')]
    keys = ('publications', 'nodes', 'links', 'lyphs', 'materials', 'chains', 'groups')
    n_obj = sum([sum([len(j[k])
                      if k in j else 0 for k in keys])
                 for j in js])
    n_pair = sum([sum([sum([len(o) for o in j[k]])
                       if k in j else 0 for k in keys])
                  for j in js])
    print(f'''rdf
trip:  {n_trip}
class: {n_class}
ind:   {n_ind}

json
obj:   {n_obj}
pair:  {n_pair}''')
    return [['Type', 'Authored', 'Expanded'],
            ['Individual', n_obj, n_ind],
            ['Statement', n_pair, n_trip],
            ['owl:Class', 'n/a', n_class],]

Type        Authored  Expanded
Individual      1714     25940
Statement       8274    318378
owl:Class        n/a       395

External links

Date: 2022-12-21T16:18:44-05:00

Author: Tom Gillespie

Created: 2022-12-22 Thu 01:38
